Dear Nomad community,
I have a Nomad job with a task as below:
variable "datacenter" {
  type = string
}

job "munge" {
  priority    = 95
  datacenters = ["${var.datacenter}"]
  type        = "system"

  group "munge" {
    restart {
      attempts = 500
      delay    = "2s"
      mode     = "fail"
    }

    task "munge" {
      driver = "raw_exec"
      user   = "root"

      config {
        command = "systemctl"
        args    = ["start", "munge"]
      }
    }
  }
}
The task gets executed and the service starts according to journalctl; however, Nomad does not seem to notice and keeps restarting the task.
This is what the Nomad logs show:
Sep 18 05:18:09 nid002546 nomad[80426]: 2023-09-18T05:18:09.539+0200 [INFO] client.alloc_runner.task_runner: Task event: alloc_id=15891105-6272-56fb-1b79-e102ecda7b34 task=munge type=Terminated msg="Exit Code: 0" failed=false
Sep 18 05:18:09 nid002546 nomad[80426]: 2023-09-18T05:18:09.590+0200 [DEBUG] client.driver_mgr.raw_exec.executor.stdio: received EOF, stopping recv loop: alloc_id=15891105-6272-56fb-1b79-e102ecda7b34 driver=raw_exec task_name=munge err="rpc error: code = Unavailable desc = error reading from server: EOF"
Sep 18 05:18:09 nid002546 nomad[80426]: 2023-09-18T05:18:09.592+0200 [INFO] client.driver_mgr.raw_exec.executor: plugin process exited: alloc_id=15891105-6272-56fb-1b79-e102ecda7b34 driver=raw_exec task_name=munge path=/usr/local/bin/nomad pid=91412
Sep 18 05:18:09 nid002546 nomad[80426]: 2023-09-18T05:18:09.592+0200 [DEBUG] client.driver_mgr.raw_exec.executor: plugin exited: alloc_id=15891105-6272-56fb-1b79-e102ecda7b34 driver=raw_exec task_name=munge
Sep 18 05:18:09 nid002546 nomad[80426]: 2023-09-18T05:18:09.592+0200 [INFO] client.alloc_runner.task_runner: restarting task: alloc_id=15891105-6272-56fb-1b79-e102ecda7b34 task=munge reason="Restart within policy" delay=2.470656639s
Why does Nomad keep restarting the service, and what can I do to make Nomad happy?