I have defined the following restart policy at group level:
restart {
  interval = "10m"
  attempts = 2
  delay    = "15s"
  mode     = "fail"
}
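However, the 15-second delay is not honored. The task is started again about one second after each "Restarting" event, even though that event announces a delay of more than 15 seconds: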
Time | Type | Description |
---|---|---|
Nov 09, '22 08:42:53 +0100 | Started | Task started by client |
Nov 09, '22 08:42:52 +0100 | Restarting | Task restarting in 15.587610296s |
Nov 09, '22 08:42:52 +0100 | Terminated | Exit Code: 2, Exit Message: Docker container exited with non-zero exit code: 2 |
Nov 09, '22 08:42:51 +0100 | Restart Signaled | healthcheck: check fail_service health using http endpoint '/health' unhealthy |
Nov 09, '22 08:41:57 +0100 | Started | Task started by client |
Nov 09, '22 08:41:56 +0100 | Restarting | Task restarting in 16.822710794s |
Nov 09, '22 08:41:56 +0100 | Terminated | Exit Code: 2, Exit Message: Docker container exited with non-zero exit code: 2 |
Is this a bug or is there something wrong in my configuration?
Tested with Nomad 1.4.1.
Here is the whole job specification:
job "fail-service" {
  datacenters = ["isys_poc"]
  type        = "service"

  group "fail-service" {
    count = 1

    network {
      port "http" {
        to = 8080
      }
    }

    task "fail-service" {
      driver = "docker"

      config {
        image = "thobe/fail_service:v0.0.12"
        ports = ["http"]
      }

      service {
        name = "${TASK}"
        port = "http"

        check {
          name     = "fail_service health using http endpoint '/health'"
          port     = "http"
          type     = "http"
          path     = "/health"
          method   = "GET"
          interval = "10s"
          timeout  = "2s"
        }

        tags = [
          "traefik.enable=true",
          "traefik.http.routers.fail-service.rule=Host(`fail-service.poc-nomad.intersys.internal`)",
        ]
      }

      env {
        HEALTHY_FOR = -1 # Stays healthy forever
      }

      resources {
        cpu    = 100 # MHz
        memory = 256 # MB
      }
    }
  }
}
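
The specification above does not show the restart block itself. As a minimal sketch, assuming the block sits directly inside the group (so that it applies to every task in that group), the placement would look roughly like this. The network, service, env and resources blocks are left out for brevity, and the exact position within the group is an assumption:

```hcl
job "fail-service" {
  datacenters = ["isys_poc"]
  type        = "service"

  group "fail-service" {
    count = 1

    # Assumed placement: restart block at group level,
    # with the values quoted at the top of this post.
    restart {
      interval = "10m"
      attempts = 2
      delay    = "15s"
      mode     = "fail"
    }

    # Task reduced to the minimum needed for a valid group.
    task "fail-service" {
      driver = "docker"

      config {
        image = "thobe/fail_service:v0.0.12"
      }
    }
  }
}
```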