How to get an alert if a job stops running?

Team,

Does Nomad send an alert email if a job goes down on a Nomad client? If not, how does it handle stopped jobs? Does it restart them automatically?

If there is no alert mechanism, what is the best pattern we can follow to receive an email in such cases, or at least when a whole client node is down?

Please help.

Thank you
Tanul

Hi @smartaquarius10,

In the event of a node being lost, the allocations that were running on the node will be re-scheduled on other nodes in the cluster which can accommodate them.

Nomad does not include native methods to trigger alerts of this kind; this is expected to be handled by an external process, such as Grafana notifications utilising Nomad telemetry. With Nomad telemetry available in Prometheus and Grafana, you could then set up alerts on metrics such as nomad_nomad_job_summary_running{job=:JobID} to indicate an insufficient number of running allocations.
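If Prometheus and Grafana are not in place yet, the same check can be approximated by a small external script that polls the job summary endpoint of the Nomad HTTP API (GET /v1/job/<job_id>/summary) and compares the running count with what you expect. The following is only a minimal sketch under some assumptions: the agent address, the expected count and the function names are illustrative, ACL tokens are not handled, and actually delivering the message (email or otherwise) is left to whatever tooling you already use.

```python
import json
import urllib.parse
import urllib.request
from typing import Optional


def fetch_job_summary(job_id: str, addr: str = "http://127.0.0.1:4646") -> dict:
    """Fetch a job summary from the Nomad HTTP API.

    The default address is an assumption (a local agent without TLS or ACLs).
    """
    url = f"{addr}/v1/job/{urllib.parse.quote(job_id, safe='')}/summary"
    with urllib.request.urlopen(url, timeout=5) as resp:
        return json.load(resp)


def running_allocations(summary: dict) -> int:
    """Sum the Running count over every task group in a job summary."""
    groups = summary.get("Summary") or {}
    return sum(int(group.get("Running", 0)) for group in groups.values())


def alert_message(summary: dict, expected_running: int) -> Optional[str]:
    """Return an alert text if fewer allocations run than expected, else None."""
    running = running_allocations(summary)
    if running >= expected_running:
        return None
    job_id = summary.get("JobID", "<unknown>")
    return f"job {job_id}: {running}/{expected_running} allocations running"


if __name__ == "__main__":
    # Hand-written sample in the shape of a job summary response;
    # replace with fetch_job_summary("your-job-id") against a real cluster.
    sample = {
        "JobID": "web",
        "Summary": {
            "frontend": {"Running": 1, "Lost": 1},
            "cache": {"Running": 1},
        },
    }
    print(alert_message(sample, expected_running=3))
```

Run from cron or a systemd timer, a non-empty result could then be handed to a mail command or a chat webhook. Note that this only covers the job going down; a check for a lost client node would need to query node status separately.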

Thanks,
jrasell and the Nomad team

Thanks for the info. But I have one question: I have a situation where I can run an application on a specific node only. If that node is down, I don't want the application to be moved to another node.

Is there any way to restrict a job to run on one specific node only?