Hi All,
I upgraded Nomad from 1.9.7 to 1.10.2 and now all my jobs are failing with an error:
Task received by client
Sent interrupt. Waiting 5s before force killing
My Docker and Exec drivers are detected, and Nomad and Consul seem to be working just fine.
I don’t even know where to start looking for potential errors.
I would like to know if there are any config changes that I need to make.
My docker version is: 28.3.0
I can’t tell you what the problem is from the information given, but the Nomad website has an upgrade guide which covers all the topics that need to be considered when upgrading between versions.
Hi @jrasell,
Thanks for your response.
I want to start by seeing errors.
The only major change I see in 1.10 is that it now requires service_identity and task_identity in the consul block. I added those, with no effect.
Can you point me to what causes the error: “Sent interrupt. Waiting 5s before force killing”?
I think the best place to start will be the client logs on one of the agents where the workload is having problems. These should give some indication of what actions the client is taking and why.
I think something has changed in how Consul secrets are retrieved. As shown in my nomad.hcl, I am using a node identity token generated in Consul with the following policy.
Here is my template stanza in my job specification:
template {
  destination = "${NOMAD_SECRETS_DIR}/envs_1.txt"
  env         = true
  data        = <<EOH
{{range ls "arch"}}
{{.Key}}={{.Value}}
{{end}}
EOH
}
The job was working earlier and is now failing because it relies on secrets from the Consul KV store.
My Consul version is: 1.21.1 (also upgraded, from 1.20.4)
Can you please tell me whether any changes are needed to get the values from the Consul KV store?
Hi @jrasell,
Thanks for your response and help.
Currently I don’t have any consul block in my job spec. All I have is a template to get values from consul KV as seen below:
template {
  destination = "${NOMAD_SECRETS_DIR}/envs_1.txt"
  env         = true
  data        = <<EOH
{{range ls "arch"}}
{{.Key}}={{.Value}}
{{end}}
EOH
}
I was looking at the following:
as well as following:
But I don’t understand what changes I need to make for workload identity and task identity.
Here is my typical consul.hcl from consul server:
Or, by the new standards after 1.7.x, if I have to configure workload identity, what changes do I need to make to my Consul and Nomad config as well as my job spec?
I truly appreciate your help with this. Without our Nomad jobs running, we are completely lost right now.
I would suggest taking a look through our Consul identity tutorial which includes details of all the items you’ll need to ensure are present for identities to work. In particular, configuration items such as Consul auth-methods and Consul binding-rules will be required, which you may not already have.
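To illustrate the Nomad side of that setup, the agent's consul block can carry default identities for services and tasks. This is only a minimal sketch: the address, the aud value, and the ttl are assumptions modelled on commonly documented defaults, not values taken from this thread. The Consul auth method and binding rules still have to be created separately in Consul.

```hcl
# Nomad agent configuration (for example nomad.hcl).
# All values below are illustrative assumptions.
consul {
  # Assumed local Consul agent address.
  address = "127.0.0.1:8500"

  # Default identity Nomad signs for services registered in Consul.
  service_identity {
    aud = ["consul.io"]
    ttl = "1h"
  }

  # Default identity Nomad signs for tasks that read from Consul,
  # such as tasks with a template block that queries the KV store.
  task_identity {
    aud = ["consul.io"]
    ttl = "1h"
  }
}
```

The aud value has to match the bound audiences of the JWT auth method configured in Consul, otherwise the login that exchanges the identity for a Consul token is rejected.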
You will need to add a Consul block to your job specifications that need a Consul identity; in the task block a consul {} declaration should be OK looking at your setup.
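As a rough sketch of that suggestion, the job below adds an empty consul block to the task that renders the template from the question. The job, group, and task names and the Docker image are hypothetical placeholders; only the template block comes from this thread.

```hcl
# Hypothetical job; names and image are placeholders for illustration.
job "example" {
  group "app" {
    task "app" {
      driver = "docker"

      config {
        image   = "busybox:1.36"
        command = "sleep"
        args    = ["3600"]
      }

      # Empty block: request a Consul identity for this task using the
      # cluster defaults, so the template below can read from the KV store.
      consul {}

      # Template from the question, unchanged apart from indentation.
      template {
        destination = "${NOMAD_SECRETS_DIR}/envs_1.txt"
        env         = true
        data        = <<EOH
{{range ls "arch"}}
{{.Key}}={{.Value}}
{{end}}
EOH
      }
    }
  }
}
```

For the template to render, the Consul role or policy that the binding rule attaches to this identity would also need read access to the arch key prefix.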
@jrasell,
Finally, after your pointer to the interactive tutorial, I got my cluster back up and running.
I have also gained insight into workload identities.
That one link was a game changer.