Migrating daily jobs to Nomad

Hi everyone,

We are currently evaluating Nomad for our application scheduling needs. Of most interest is the “distributed cron” feature, which we have implemented with periodic jobs. Our current setup spans both Windows and Linux, with a mixture of persistent services and what I would call “daily services”: jobs that we restart on a daily basis, with precise downtime windows whose timings may vary per application. This is currently managed in crontab (or Task Scheduler on Windows), where each application has a start job and a stop job (typically just a kill signal).

I am trying to work out how best to migrate this style of job to Nomad. I have been using the sysbatch scheduler with the periodic stanza to replicate this behaviour. This works well for starting a job, but I don’t have a clear solution for stopping it. The potential solutions I have in mind are:

  1. Create separate “start” and “stop” periodic jobs in Nomad, with the stop job sending a kill signal at a given time. The downside is that I would also need to make sure that the restart policy of the “start” job does not cause Nomad to restart it.

  2. Use the kill_timeout parameter in the task stanza to have the task end at a given time. We run these kinds of tasks over the course of a day, so the value would be quite long, and this doesn’t seem to be the intended use of the feature. It would also require us to always convert a fixed end time to a duration, which is a potential source of error.

  3. Make changes to our applications to fit more of a “service” style, without the need for daily restarts. This is feasible for some but not all of our stack, as there are legacy elements.

  4. Run the jobs as a service but still have a daily (periodic) stop job, and use the delay parameter in the restart stanza to emulate downtime.
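
To illustrate the conversion concern in option 2, here is a minimal sketch of turning a fixed end time into a duration. The function names `duration_until` and `format_duration` are my own, it assumes naive local datetimes, and it ignores DST transitions, which is exactly the kind of error I am worried about. The output is a Go-style duration string such as "9h30m0s", which I am assuming is the format Nomad's duration fields expect.

```python
from datetime import datetime, time, timedelta


def duration_until(end: time, now: datetime) -> timedelta:
    """Return the time from `now` until the next occurrence of wall-clock `end`.

    Assumes naive local datetimes; DST shifts are NOT accounted for.
    """
    target = datetime.combine(now.date(), end)
    if target <= now:
        # The end time has already passed today, so use tomorrow's.
        target += timedelta(days=1)
    return target - now


def format_duration(delta: timedelta) -> str:
    """Format a timedelta as a Go-style duration string, e.g. '9h30m0s'."""
    total = int(delta.total_seconds())
    hours, rem = divmod(total, 3600)
    minutes, seconds = divmod(rem, 60)
    return f"{hours}h{minutes}m{seconds}s"


if __name__ == "__main__":
    # Hypothetical example: job starts at 08:00 and should end at 17:30.
    start = datetime(2024, 1, 15, 8, 0)
    print(format_duration(duration_until(time(17, 30), start)))  # 9h30m0s
```

Every start time/end time pair would need this calculation redone whenever either side changes, which is why a fixed end time would be much easier for us to maintain.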

Ideally I am looking for something like an endtime parameter, where the job terminates at a specific time; however, this doesn’t appear to exist. I fear I am either misunderstanding something or missing some documentation.

Does anyone have experience with similar kinds of jobs, or any advice on the best approach here?

Thanks!
