Stop and retry batch jobs that have been running for longer than an hour

For short-lived batch jobs that run periodically (using the periodic stanza), is there a way to limit the job's execution time directly in Nomad?

Consider the following in bash: while :; do timelimit ./script-that-should-not-take-longer-than-an-hour.sh ; sleep 5m ; done. timelimit kills the script if it gets stuck, and the loop then retries after 5 minutes.

The job runs in Docker, so I could wrap the Docker entrypoint in a shell script that uses timelimit, but a Nomad-native solution is preferred.
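For reference, the wrapper approach could look roughly like the sketch below. It assumes GNU coreutils timeout is available in the image (swapped in here for timelimit), and run_with_limit is a made-up helper name, not anything Nomad or Docker provides.

```shell
# Hypothetical helper for an entrypoint wrapper script.
# Assumes GNU coreutils `timeout` is installed in the image.
#
# run_with_limit LIMIT CMD [ARGS...]
#   Runs CMD and sends SIGTERM once LIMIT has elapsed, then SIGKILL
#   10 seconds later if the process is still alive.
#   Returns CMD's own exit status, 124 if the limit was hit,
#   or 137 if SIGKILL was needed.
run_with_limit() {
  limit=$1
  shift
  timeout -k 10 "$limit" "$@"
}

# Example call from the entrypoint (not executed here):
#   run_with_limit 1h ./script-that-should-not-take-longer-than-an-hour.sh
```

With this in place there is no need for the while loop: a stuck script makes the container exit non-zero, Nomad should treat the task as failed, and the job's restart settings plus the next periodic launch decide when it runs again.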

Hi @tobiasmuehl :wave:

I don’t think there’s a native way to do this. You can prevent two instances of the same job from running at the same time with prohibit_overlap in the periodic stanza, but it doesn’t stop the instance that is already running; it only prevents a new one from starting.

But this could be a helpful feature, so I would suggest opening a feature request :slightly_smiling_face: