How to avoid killing all tasks of an allocation when one task fails

In our use case, we do not want failing tasks to be restarted automatically, so we set:

restart {
  attempts = 0
  delay    = "15s"
  interval = "24h"
  mode     = "fail"
}

However, when a task fails, Nomad appears to kill all the other running tasks belonging to the same allocation.

Is there a way to prevent that?

Thanks.

Hi @scyd, and thanks for asking this question. Nomad treats an allocation as an immutable object, so when a single task within a group fails, as you describe, the remaining tasks in that allocation are marked as failed as well. There is no way around this behaviour. If the tasks within the group are as independent as you describe, it may be advisable to run some (or all) of them in separate groups, or even in separate jobs.
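As a rough sketch of the separate-groups approach, the job below places each task in its own group, so each task gets its own allocation and a failure in one should not take the other down. The job, group, and task names, the datacenter, the Docker driver, and the image names are all hypothetical placeholders; the restart settings are copied from the question.

```hcl
# Minimal sketch: one task per group, so each task runs in its own allocation.
# All names, the datacenter, and the images are placeholders, not from the original job.
job "example" {
  datacenters = ["dc1"]

  group "first" {
    # Same restart policy as in the question: never restart, fail immediately.
    restart {
      attempts = 0
      delay    = "15s"
      interval = "24h"
      mode     = "fail"
    }

    task "first" {
      driver = "docker"

      config {
        image = "example/first:latest" # hypothetical image
      }
    }
  }

  group "second" {
    restart {
      attempts = 0
      delay    = "15s"
      interval = "24h"
      mode     = "fail"
    }

    task "second" {
      driver = "docker"

      config {
        image = "example/second:latest" # hypothetical image
      }
    }
  }
}
```

Keep in mind the trade-off: tasks in different groups are scheduled independently, so they may land on different nodes and no longer share the allocation directory or the group's network namespace.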

Thanks,
jrasell and the Nomad team
