Recently we have been seeing the following: we have an nginx job whose configuration is templated by Nomad; the template gets its services from Consul and does the right thing™. This has worked flawlessly for the past year.
Now, more often than not, newly started (or re-evaluated) jobs stay in the Pending state for quite a while, with the following event appearing when you check the allocation’s status:
2021-03-21T12:25:04Z Template Missing: health.service(myservicename passing)
When looking at Consul directly, that service exists, and its health checks are in fact all passing. It seems that the local Consul agent is very slow to respond to Nomad’s query, or something else is going on.
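For anyone who wants to reproduce the "looking at Consul directly" check, here is a rough sketch of how one could time the local agent's answer. It assumes the agent listens on the default http://127.0.0.1:8500 without ACL tokens or TLS, and that the template's lookup corresponds to Consul's /v1/health/service/<name>?passing endpoint, which is my understanding but not something I have confirmed. The helper names are made up for illustration.

```python
import json
import time
import urllib.parse
import urllib.request


def health_url(service, agent="http://127.0.0.1:8500"):
    """Build the Consul health URL for a service, passing instances only.

    The default agent address is an assumption; adjust it for your setup.
    """
    return f"{agent}/v1/health/service/{urllib.parse.quote(service)}?passing"


def passing_addresses(entries):
    """Extract (address, port) pairs from the JSON list the endpoint returns."""
    result = []
    for entry in entries:
        svc = entry["Service"]
        # Service.Address can be empty; fall back to the node's address.
        address = svc.get("Address") or entry["Node"]["Address"]
        result.append((address, svc["Port"]))
    return result


def timed_query(service, agent="http://127.0.0.1:8500", timeout=10.0):
    """Query the agent and return (instances, seconds the request took)."""
    start = time.monotonic()
    with urllib.request.urlopen(health_url(service, agent), timeout=timeout) as resp:
        entries = json.load(resp)
    return passing_addresses(entries), time.monotonic() - start
```

Calling timed_query("myservicename") on the affected client node should show whether the agent returns the passing instances quickly. If it does, the delay is more likely on the Nomad/template side than in Consul itself.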
It may also be a Consul thing, but since we see no problems in our regular daily use except in combination with Nomad, I figured I’d stick it in the Nomad category.
Any ideas, anyone?