I just want to check whether a file is available (for example /bin/tr, which exists in busybox), or run a script that carries out some checks. If the file is no longer available, the job should be stopped and started on a new client. How can I implement that, or where is my mistake at this point?
If I run the existing Nomad file, the job never reaches the healthy state. I hope someone can help me or explain it. I would be very grateful for any help.
I had a similar experience running containers, where the service never came back healthy from a script health check. In my case, it was a mongo command to see if the database was healthy.
I naively added the script to check the status, and I could see from the container logs that the script was being executed, but the service never became healthy in Consul.
I dug a bit deeper in the docs, and on the Checks page I noticed:
In Consul 0.9.0 and later, script checks are not enabled by default.
Do I need Consul? Nowhere in the documentation does it say I need it! The first sentence on service Stanza - Job Specification | Nomad by HashiCorp says: "with the specified provider; Nomad or Consul"
So I assume that Nomad itself also supports health checks. Or not?
Thanks for clarifying that you are not using Consul. I had just assumed you were, since to my knowledge, as a user, it is the only mechanism to discover services.
Getting back to your problem - the script health check not coming back healthy - I don’t see anything in Nomad that would prevent that, so I’m assuming that while Nomad is actually executing the probes, it is not reporting it anywhere. However, I must admit that this doesn’t really convince me and I stand to be corrected.
You should be able to deploy Nomad without Consul, but the documentation on services states:
The service stanza instructs Nomad to register a service with the specified provider; Nomad or Consul
In your case the service provider would be Nomad. However, the service discovery part says:
Nomad schedules workloads of various types across a cluster of generic hosts. Because of this, placement is not known in advance and you will need to use service discovery to connect tasks to other services deployed across your cluster. Nomad integrates with Consul to provide service discovery and monitoring.
So this takes me back to understanding that if you want to implement probes, they need to report status to Consul in order to have service discovery.
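For reference, a script check that reports to Consul might look roughly like the sketch below. The job, group, task, and service names, the image tag, and the interval values are all placeholders rather than anything taken from the original job file, and the check simply tests for /bin/tr as described in the question.

```hcl
# Hypothetical job: every name and value here is an assumption for illustration.
job "file-check" {
  datacenters = ["dc1"]

  group "app" {
    task "busybox" {
      driver = "docker"

      config {
        image   = "busybox:1.36"
        command = "sleep"
        args    = ["3600"]
      }

      # Task-level service, so the script check runs inside this task.
      service {
        name     = "file-check"
        provider = "consul" # the default provider, spelled out for clarity

        check {
          type     = "script"
          name     = "tr-present"
          command  = "/bin/sh"
          args     = ["-c", "test -f /bin/tr"] # exit 0 = passing, non-zero = failing
          interval = "10s"
          timeout  = "2s"
        }
      }
    }
  }
}
```

With this layout the check status should show up on the service in Consul; whether it also gates the deployment's healthy state depends on the update stanza settings, which are not shown here.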
@brucellino1 Thank you for the great explanation. I’ll add that Nomad 1.3 does have native service discovery, but it doesn’t support health checks at this time.
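As a rough illustration of what native registration looks like, the fragment below registers a service with the Nomad provider and, in line with the note above, defines no check block. The group name, service name, and port label are made up for the example.

```hcl
# Hypothetical group fragment: names and port values are assumptions.
group "app" {
  network {
    port "http" {
      to = 8080
    }
  }

  service {
    name     = "file-check"
    provider = "nomad" # register with Nomad's built-in service discovery
    port     = "http"
    # No check block here: per the post above, native services in 1.3
    # are registered without health checks.
  }
}
```

Once the allocation is running, `nomad service info file-check` should list the registered address and port.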
The restart stanza docs and the reschedule stanza docs have some good information on how job failure and client rescheduling work, and you can use both stanzas to customize how you’d want the failover process to work (once you get service discovery configured).
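A minimal sketch of how the two stanzas could fit together for the "restart locally, then move to another client" behaviour is shown here. All values are illustrative assumptions, not recommendations, and the `check_restart` block is shown only as a comment because it belongs inside a service check, which needs the service registration sorted out first.

```hcl
# Hypothetical group fragment: attempts, intervals, and delays are assumptions.
group "app" {
  # Local restarts on the same client.
  restart {
    attempts = 2     # restarts allowed within the interval
    interval = "30m"
    delay    = "15s"
    mode     = "fail" # once attempts are exhausted, mark the allocation failed
  }

  # Placement of a replacement allocation after the one above has failed.
  reschedule {
    attempts       = 3
    interval       = "1h"
    delay          = "30s"
    delay_function = "constant"
    unlimited      = false
  }

  # Inside a service check, a block like the following asks Nomad to restart
  # the task after repeated failing checks:
  #
  #   check_restart {
  #     limit = 3     # consecutive failing checks before a restart
  #     grace = "30s" # wait after task start before counting failures
  #   }
}
```

Restarts triggered this way count against the restart policy, so a check that keeps failing would eventually exhaust `attempts`, fail the allocation, and hand over to `reschedule`, which tries to place the replacement on a different client.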