Job Update Unhealthy Allocation Behavior

Hi all,

I’m a bit confused about how Nomad behaves during a job update deployment.

During a deployment, when a specific allocation gets marked unhealthy based on its Consul health check, will Nomad apply the restart stanza even if a check_restart is not defined?

Or does a check_restart stanza need to be defined to achieve this behavior?

Also, if an allocation does get restarted after going unhealthy, and it becomes healthy, will the deployment see that as progress and move the progress deadline forward?

Just to update this thread.

The restart stanza will NOT be triggered by a failing health check unless check_restart is defined. If you are depending on the health check for restarts, check_restart needs to be defined.
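For anyone landing here later, this is roughly what that looks like in a job file. It is a fragment, not a complete job, and the group, task, and service names, the port label, the check path, and all of the values are made up for illustration.

```hcl
# Fragment of a group definition; all names and values are placeholders.
group "app" {
  # Restarts triggered by check_restart are governed by this policy.
  restart {
    attempts = 2
    interval = "5m"
    delay    = "15s"
    mode     = "fail"
  }

  task "web" {
    # ...driver, config, and resources omitted...

    service {
      name = "web"    # hypothetical service name
      port = "http"   # hypothetical port label defined elsewhere in the job

      check {
        type     = "http"
        path     = "/health"   # hypothetical endpoint
        interval = "10s"
        timeout  = "2s"

        # Without this block, a failing Consul check does not restart the task.
        check_restart {
          limit = 3       # restart after 3 consecutive failed checks
          grace = "30s"   # wait this long after a (re)start before counting failures
        }
      }
    }
  }
}
```

The default limit of 0 disables health-check-based restarts, which matches what I saw when check_restart was left out.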

If an allocation is deemed unhealthy by the Nomad deployment, it will stay unhealthy. The allocation needs to be rescheduled in order for the deployment to progress, which means forcing the allocation to fail by setting restart attempts to 0. The new allocation, if it becomes healthy, will satisfy the deployment progression.
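A sketch of the setup I mean, assuming the behavior described above. Again, this is only a fragment: the group name and every duration are placeholders, and the task (with its service check and check_restart) is left out.

```hcl
# Fragment of a group definition; names and durations are placeholders.
group "app" {
  update {
    health_check      = "checks"   # deployment health follows the Consul checks
    min_healthy_time  = "10s"
    healthy_deadline  = "5m"
    progress_deadline = "10m"
  }

  # No in-place restarts: the first restart attempt fails the allocation.
  restart {
    attempts = 0
    mode     = "fail"
  }

  # Lets Nomad place a replacement allocation after the failure.
  reschedule {
    delay          = "30s"
    delay_function = "constant"
    unlimited      = true
  }

  # ...task with service, check, and check_restart omitted...
}
```

The idea is that the failed allocation gets replaced rather than restarted in place, so the replacement is what the deployment counts as healthy.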

I’m not sure whether deployments should change “unhealthy” to “healthy” if the same allocation eventually becomes healthy after restarts from the restart stanza. I would have expected them to. Should this be revisited?

Noting here for anyone who comes along later that we’ve done some follow-up on this thread in https://github.com/hashicorp/nomad/issues/6407