How to monitor nomad jobs history?

cdwiegand · December 23, 2020, 1:54am

We’re having some real difficulties with Nomad. Our service jobs seem fine and seem to self-heal, but periodic jobs? It’s really difficult to determine after the fact whether they actually ran. We have a number of jobs that run overnight. Sometimes two instances have run simultaneously (even with prohibit_overlap set to true), some don’t show any allocations, and some run only every other day (even though the task ends after ~1 hour…). What do other people do to monitor their periodic jobs? I can’t control all of the jobs themselves, so changing the jobs to call a webhook when they complete doesn’t work. We do have centralized logging, but I’m not going to dig through that every day for every job to see if it thinks it ran successfully (and some of those jobs don’t have much in the way of logging anyway). Do we need to just dump the current status every few minutes and write a program to analyse it and determine if a job didn’t run on time? Is there a way to do this in the GUI that I’m not seeing? Is there a historical job-run log that I’m missing? Even an API or CLI would do - I can wrap it in a program to call (via system crontab!) to get the info out of Nomad.

ravi · January 3, 2022, 6:39pm

I’ve also started looking into this recently. Any insight would help me start my investigation.

Kamilcuk · December 9, 2024, 10:46pm

Hi. I posted about this in “Nomad Job Launches UI empty / no history of periodic jobs” (reply #2).

Basically, run nomad operator api /v1/event/stream > file in the background and then parse the file with a Python script. The event stream will give you all the information, and if some expected information is missing, you can detect that and send an alert. As an example stack, Grafana Loki has an absent_over_time operator and supports JSON parsing, and Grafana can send alerts based on Grafana Loki queries.
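As a rough illustration of the parsing step, here is a minimal sketch. It assumes the stream was saved as newline-delimited JSON, that each frame carries an Events list whose entries have Topic, Type and Key fields, and that each launch of a periodic job is registered as a child job whose ID looks like <parent>/periodic-<unix timestamp>. Please check those assumptions against the output of your own cluster. The function names and the events.ndjson file name are made up for the example.

```python
import json
import time

# Assumed naming scheme for children of periodic jobs: "<parent>/periodic-<unix ts>".
PERIODIC_MARKER = "/periodic-"


def last_launches(lines):
    """Map each periodic parent job ID to its latest launch time (Unix seconds)."""
    latest = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            frame = json.loads(line)
        except json.JSONDecodeError:
            continue  # e.g. a line that was only partially written to the file
        if not isinstance(frame, dict):
            continue
        # Heartbeat frames are assumed to be empty objects without "Events".
        for event in frame.get("Events") or []:
            if event.get("Topic") != "Job" or event.get("Type") != "JobRegistered":
                continue
            parent, sep, suffix = str(event.get("Key", "")).rpartition(PERIODIC_MARKER)
            if not sep or not suffix.isdigit():
                continue  # not a periodic child job
            latest[parent] = max(latest.get(parent, 0), int(suffix))
    return latest


def overdue(latest, expected_max_age, now=None):
    """Return the expected jobs that never launched or launched too long ago.

    expected_max_age maps a parent job ID to the maximum tolerated number of
    seconds since its last launch (a hypothetical, hand-maintained table).
    """
    now = time.time() if now is None else now
    late = []
    for job, max_age in expected_max_age.items():
        ts = latest.get(job)
        if ts is None or now - ts > max_age:
            late.append(job)
    return sorted(late)


if __name__ == "__main__":
    # "events.ndjson" is a placeholder for the file written by the background command.
    with open("events.ndjson", encoding="utf-8") as fh:
        launches = last_launches(fh)
    # Example expectation: a job called "nightly-backup" should launch at least every 25 hours.
    for job in overdue(launches, {"nightly-backup": 25 * 3600}):
        print(f"ALERT: periodic job {job!r} has not launched on time")
```

This only checks that a launch was registered, not that it succeeded. Under the same assumptions, the allocation events in the stream could be inspected in a similar way to track completion status. The script could run from the system crontab and exit non-zero or post to an alerting system instead of printing.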
