Rolling updates with distinct_hosts constraint

I am running an application (haproxy) that needs to always be available to end users. I typically have 3 instances of the application running per cluster/datacenter. I prefer to have each instance on a separate node/host, to protect against node problems affecting multiple instances. An abbreviated example of my setup looks like this:

job "loadbalancer" {
  group "haproxy" {
    count = 3
    constraint { distinct_hosts = true }
    constraint {
      attribute = node.class
      value     = "haproxy"
    }
  }
  task "haproxy" {
    ...
  }
}

Now I want to add rolling updates into the mix, so that applying a change never takes the service offline. Say I only have 3 nodes in this class. If I add the update stanza to my job like this:

job "loadbalancer" {
  update {
    max_parallel = 1
    auto_revert = true
    auto_promote = true
    canary = 1
  }
  group "haproxy" {
    count = 3
    constraint { distinct_hosts = true }
    constraint {
      attribute = node.class
      value     = "haproxy"
    }
  }

  task "haproxy" {
    ...
  }
}

… I will obviously have irreconcilable constraints that make the job unplaceable: the canary is a fourth allocation, and with distinct_hosts there is no fourth eligible host to put it on.

Class haproxy filtered 3 nodes
Constraint ${node.class} = haproxy filtered 1 node
Constraint distinct_hosts filtered 3 nodes

If I have one extra node in the datacenter that allows this application, I think the problem goes away, since the canary would have a free host to land on. However, I am not sure I can guarantee that scenario, and I would like a contingency plan.

I am looking at the scaling stanza to see if I can leverage it. I am also considering relaxing the hard distinct_hosts constraint into a soft preference, e.g. with the affinity stanza.
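One caveat I noticed (I may be reading the docs wrong): affinity does not seem to have a distinct_hosts-style operator, so the closer soft equivalent is probably the spread stanza, which scores placements toward an even distribution across an attribute without hard-filtering any node. A minimal sketch of the group with spread in place of distinct_hosts:

group "haproxy" {
  count = 3

  # Prefer one allocation per host, but allow temporary
  # co-location (e.g. for the canary) when hosts run out.
  spread {
    attribute = "${node.unique.hostname}"
    weight    = 100
  }

  constraint {
    attribute = "${node.class}"
    value     = "haproxy"
  }

  task "haproxy" {
    ...
  }
}

The trade-off is that spread is only a scoring preference, so two instances could share a host while the canary is alive.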

Any suggestions would be helpful.


(just a thought off the top of my head; I could be wrong, feel free to ignore altogether)

If a prestop lifecycle hook existed, could that help?
(canary would not be needed then, maybe?)

ref: (pre-)Stop/Kill action/command · Issue #9872 · hashicorp/nomad · GitHub
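For context (my reading of the task lifecycle docs, so treat it as an assumption): Nomad currently supports prestart, poststart, and poststop hooks, but no prestop, which is exactly what the linked issue asks for. For contrast, a sketch of the closest existing hook, with a hypothetical task and script:

task "notify-drained" {
  lifecycle {
    hook = "poststop"
  }

  driver = "exec"

  config {
    # Hypothetical script; runs only after the main task has
    # stopped, which is too late to drain haproxy gracefully.
    command = "/usr/local/bin/notify-drained.sh"
  }
}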

Hi there, did you come up with a solution to your problem? I’ve come up against the same thing.

TIA

Our current solution is to run an individual job for each instance instead of multiple instances of the same job. Probably not ideal, but there is no chance that a change to one job will affect any of the other instances. It has been working very well for us.
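A sketch of how that can look (illustrative, not our exact spec): three copies of the job, each with count = 1, deployed and updated independently:

# Repeated as "loadbalancer-2" and "loadbalancer-3".
job "loadbalancer-1" {
  group "haproxy" {
    count = 1

    constraint {
      attribute = "${node.class}"
      value     = "haproxy"
    }

    task "haproxy" {
      ...
    }
  }
}

With count = 1 per job, distinct_hosts no longer applies within a job; if you still want one instance per host, you can pin each job to its own node, e.g. with a constraint on ${node.unique.hostname}.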