Fault-tolerant app scheme

Hello everyone! I have 3 large monolithic applications (LNPP stack). At this point the databases have been moved to HA clusters, and I just need to run the apps in Docker containers, separately.
I'm trying to understand how to realize this scheme: one application per Nomad client, plus one spare Nomad client node as a backup launch target for any of the applications in case of problems with their nodes.
I tried using "affinity", but after a job migrates to the reserve node it can't migrate back once the main app's Nomad node comes back online.

# APP-01 node
  affinity {
    attribute = "${attr.unique.hostname}"
    value     = "nomad-node-01"
    weight    = 100
  }
# other APP node
  affinity {
    attribute = "${attr.unique.hostname}"
    value     = "nomad-node-02"
    weight    = -100
  }
# other APP node
  affinity {
    attribute = "${attr.unique.hostname}"
    value     = "nomad-node-03"
    weight    = -100
  }
# Reserve node
  affinity {
    attribute = "${attr.unique.hostname}"
    value     = "nomad-node-04"
    weight    = 50
  }

What is the best way to implement this scheme? Should I use "constraint" or "reschedule"? (I tried both but failed.)

Thanks!

Hi @FedorPRO :wave:

Once Nomad places an allocation it will remain on that client until you actively act on it, such as by registering a new job version or draining the node. I think the only automated process that would cause an allocation to move is preemption, which doesn't apply to your case.
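For example, one manual way to push allocations off the reserve node once the main node is healthy again is to drain it temporarily (a sketch; `<reserve-node-id>` is a placeholder for your actual node ID):

```shell
# Drain the reserve node: its allocations are rescheduled onto the
# remaining eligible nodes, including the recovered main node.
nomad node drain -enable -yes <reserve-node-id>

# Once the allocations have moved, make the reserve node
# eligible for scheduling again.
nomad node eligibility -enable <reserve-node-id>
```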

One of the roles of an orchestrator is to abstract infrastructure details away from operators, so trying to be very specific about how scheduling should be done often makes things more complicated.

Are your client configurations all the same? Meaning, does it really matter if job1 always runs on the APP-01 client?

If not, I think you could use oversized resource requests as a workaround to prevent Nomad from scheduling anything else on a client that is already running a job.

For example, if your clients have 16GB of memory, you could set your job to something like this:

job "app1" {
  # ...
  group "app1" {
    # ...
    task "app1" {
    # ...
    resource {
      memory = 10000 # 10GB
    }
  }
}

Hello! Thanks for the reply, I will keep your advice in mind.

After getting clarification from the programmers: I have 1 large application and 2 small ones, and I need them all to be on different Nomad clients.

Right, the applications have different requirements, but are the client machines different? Meaning, do you have some machines with more memory than others, for example?

If not, you could just use resources to claim an entire client for each app, so that no other job can be scheduled alongside it. Then you don't need to worry about micro-managing the Nomad scheduler :slightly_smiling_face:
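If it's an option to run all three apps as groups of a single job, another approach worth mentioning is Nomad's distinct_hosts constraint, which forces each task group onto a different client (a sketch; the job and group names below are placeholders):

```hcl
job "apps" {
  # Place every task group in this job on a different client.
  constraint {
    operator = "distinct_hosts"
    value    = "true"
  }

  group "big-app" {
    task "big-app" {
      # ...
    }
  }

  group "small-app-1" {
    task "small-app-1" {
      # ...
    }
  }

  group "small-app-2" {
    task "small-app-2" {
      # ...
    }
  }
}
```

Note that distinct_hosts only applies between groups of the same job, so this works only if bundling the apps into one job is acceptable; otherwise the oversized-resources approach above is simpler.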