Fault-tolerant app scheme

Hello everyone! I have 3 large monolithic applications (LNPP stack). At this point the databases have been moved to HA clusters, and I just need to run the apps in Docker containers, separately.
I'm trying to understand how to realize this scheme: one application per Nomad client, plus one spare Nomad client node as a backup launch target for any of the applications in case of problems with their nodes.
I tried using "affinity", but after a job migrates to the reserve node it can't migrate back once the main app's Nomad node comes back online.

# APP-01 node
  affinity {
    attribute = "${attr.unique.hostname}"
    value     = "nomad-node-01"
    weight    = 100
  }
# other APP node
  affinity {
    attribute = "${attr.unique.hostname}"
    value     = "nomad-node-02"
    weight    = -100
  }
# other APP node
  affinity {
    attribute = "${attr.unique.hostname}"
    value     = "nomad-node-03"
    weight    = -100
  }
# Reserve node
  affinity {
    attribute = "${attr.unique.hostname}"
    value     = "nomad-node-04"
    weight    = 50
  }

What is the best way to implement this scheme? Should I use "constraint" or "reschedule"? (I tried both but failed.)

Thanks!

Hi @FedorPRO :wave:

Once Nomad places an allocation it will remain on that client until you actively act on it, such as by registering a new job version or draining the node. I think the only automated process that would cause an allocation to move is preemption, which doesn't apply to your case.
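For example, one manual way to push allocations off the reserve node once the main node is healthy again is to drain it temporarily (a sketch; `<reserve-node-id>` is a placeholder for your actual node ID):

```shell
# Drain the reserve node: its allocations are rescheduled onto the
# remaining eligible nodes, including the recovered main node.
nomad node drain -enable -yes <reserve-node-id>

# Once the allocations have moved, make the reserve node
# eligible for scheduling again.
nomad node eligibility -enable <reserve-node-id>
```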

One of the roles of an orchestrator is to abstract infrastructure details away from operators, so trying to be very specific about how scheduling should be done often makes things more complicated.

Are your client configurations all the same? Meaning, does it really matter if job1 always runs on the APP-01 client?

If not, I think you could use oversized resource requests as a workaround to prevent Nomad from scheduling anything else on a client that is already running a job.

For example, if your clients have 16GB of memory, you could set your job to something like this:

job "app1" {
  # ...
  group "app1" {
    # ...
    task "app1" {
    # ...
    resource {
      memory = 10000 # 10GB
    }
  }
}

Hello! Thanks for the reply, I will keep your advice in mind.

After getting clarification from the programmers: I have 1 large application and 2 small ones, and I need them all to be on different Nomad clients.

Right, the applications have different requirements, but are the client machines different? Meaning, do you have some machines with more memory than others, for example?

If not, you could just use resources to claim an entire client for each app, so that no other job can be scheduled alongside it. Then you don't need to worry about micro-managing the Nomad scheduler :slightly_smiling_face:
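If it's an option to run all three apps as groups of a single job, another approach worth mentioning is Nomad's distinct_hosts constraint, which forces each task group onto a different client (a sketch; the job and group names below are placeholders):

```hcl
job "apps" {
  # Place every task group in this job on a different client.
  constraint {
    operator = "distinct_hosts"
    value    = "true"
  }

  group "big-app" {
    task "big-app" {
      # ...
    }
  }

  group "small-app-1" {
    task "small-app-1" {
      # ...
    }
  }

  group "small-app-2" {
    task "small-app-2" {
      # ...
    }
  }
}
```

Note that distinct_hosts only applies between groups of the same job, so this works only if bundling the apps into one job is acceptable; otherwise the oversized-resources approach above is simpler.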