Why doesn't eval -force-reschedule ... force-reschedule failed tasks?

I’ve seen lots of situations where a job has some failed tasks (my latest example: an app tried to acquire a Liquibase lock on a DB table, but the lock was held by another app, so the task failed). Re-submitting the job without modification does nothing, which is expected. But the command

nomad job eval -force-reschedule my-job

also does nothing, which is not expected. According to the docs, this should force rescheduling of failed allocations. I haven’t defined any reschedule block in my job file, which should mean the job uses the defaults for service jobs, according to the "reschedule Block - Job Specification" page on HashiCorp Developer.

Now, the only ways I have found to really force rescheduling of failed allocations are to either stop the job with -purge and re-submit it, or scale the affected task group with

nomad job scale my-job my-group <number>
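
A minimal sketch of that sequence, for anyone who wants to script it. The helper names `run` and `retry_failed_allocs` are made up for illustration and are not part of Nomad; the job name, group name and count are placeholders. It defaults to a dry run that only prints the commands, and it assumes that scaling the group to the desired count is what triggers the new placement, as described above.

```shell
#!/bin/sh
# Sketch only: try a forced reschedule, then fall back to scaling the group.
# DRY_RUN=1 (the default here) prints the commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"

# run is a hypothetical wrapper: print the command in dry-run mode, else execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# retry_failed_allocs is a hypothetical helper name, not a Nomad command.
retry_failed_allocs() {
  job="$1"
  group="$2"
  count="$3"
  # Ask the scheduler to re-evaluate the job and reschedule failed allocations.
  run nomad job eval -force-reschedule "$job"
  # Workaround from this thread: set the count of the affected task group.
  run nomad job scale "$job" "$group" "$count"
}
```

For example, `retry_failed_allocs my-job my-group 2` prints both commands; setting `DRY_RUN=0` runs them for real. Checking `nomad job status my-job` between the two steps shows whether the eval already replaced the failed allocation.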

Am I the only one having a problem with this?

No, you’re not the only one. In fact, this is the single most annoying thing in Nomad: you cannot just tell it to try again, no matter what you do. The best things you can try:

  • nomad stop -purge
  • nomad job eval -force-reschedule
  • change the job description, e.g. by adding a dummy tag, and then nomad run
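
One way to script the last option without editing the file each time is to feed a changing value in through a job variable. This is only a sketch under assumptions: it assumes the job file is written in HCL2, declares a variable called `redeploy_nonce` (a made-up name) and references it somewhere that changes the job spec, for example in a meta block. The helper name and file name are also hypothetical.

```shell
#!/bin/sh
# Sketch only: resubmit a job with a changed, otherwise unused variable so that
# Nomad sees a modified job specification.
# ASSUMPTION: the job file declares a variable named "redeploy_nonce"
# (hypothetical) and uses it, e.g. inside a meta block.
DRY_RUN="${DRY_RUN:-1}"

# resubmit_with_nonce is a hypothetical helper name, not a Nomad command.
resubmit_with_nonce() {
  jobfile="$1"
  # Use the given nonce, or fall back to the current Unix timestamp.
  nonce="${2:-$(date +%s)}"
  if [ "$DRY_RUN" = "1" ]; then
    printf 'nomad job run -var redeploy_nonce=%s %s\n' "$nonce" "$jobfile"
  else
    nomad job run -var "redeploy_nonce=$nonce" "$jobfile"
  fi
}
```

Calling `resubmit_with_nonce my-job.nomad.hcl` in dry-run mode prints the command that would be run; with `DRY_RUN=0` it submits the job.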

I hadn’t heard of the scale trick. I’ll try it out, thank you.