I have set up a Nomad cluster along with a Consul instance so that the jobs can register services to connect to.
However, the services keep getting synced and deregistered. Here is what I have from the Consul logs:
2021-01-26T14:49:59.174Z [INFO] agent: Synced check: check=_nomad-check-dc23801467b8a65a4fd82311c2606724a180065c
2021-01-26T14:50:00.072Z [INFO] agent: Synced check: check=_nomad-check-1783c554d9ee0a25d52532f4178c392e931e4bb1
2021-01-26T14:50:04.511Z [INFO] agent: Synced service: service=_nomad-task-e8d2b77b-3bf5-96c1-8323-63b6151e2cf3-lb0-lb0-admin-admin
2021-01-26T14:50:09.962Z [INFO] agent: Deregistered service: service=_nomad-task-e8d2b77b-3bf5-96c1-8323-63b6151e2cf3-lb0-lb0-admin-admin
2021-01-26T14:50:34.554Z [INFO] agent: Synced service: service=_nomad-task-e8d2b77b-3bf5-96c1-8323-63b6151e2cf3-lb0-lb0-admin-admin
2021-01-26T14:50:39.984Z [INFO] agent: Deregistered service: service=_nomad-task-e8d2b77b-3bf5-96c1-8323-63b6151e2cf3-lb0-lb0-admin-admin
2021-01-26T14:51:04.589Z [INFO] agent: Synced service: service=_nomad-task-e8d2b77b-3bf5-96c1-8323-63b6151e2cf3-lb0-lb0-admin-admin
2021-01-26T14:51:10.009Z [INFO] agent: Deregistered service: service=_nomad-task-e8d2b77b-3bf5-96c1-8323-63b6151e2cf3-lb0-lb0-admin-admin
There is nothing in the Nomad logs that explains why this happens.
Any idea what could cause this issue?
Nomad v1.0.2
Consul v1.9.1
Hi @spack971, are you able to confirm whether it is Nomad actually triggering the deregistrations? You should be able to see metrics for client.consul.service_deregistrations (and client.consul.service_registrations) from this client accumulating if that is the case. That consistent ~5 second gap before the deregistrations is curious: Nomad's reconciliation loop for Consul services runs every 30 seconds.
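For anyone who wants to check that timing on their own agent, here is a minimal sketch in Python that pairs each "Synced service" line with the next "Deregistered service" line for the same service ID and reports the gaps. It assumes the Consul log format shown in the first post. The names `parse_events` and `measure_gaps` are made up for illustration, and the sample is the six service lines posted above.

```python
import re
from datetime import datetime

# Sample taken from the Consul log lines in the first post.
SAMPLE_LOG = """\
2021-01-26T14:50:04.511Z [INFO] agent: Synced service: service=_nomad-task-e8d2b77b-3bf5-96c1-8323-63b6151e2cf3-lb0-lb0-admin-admin
2021-01-26T14:50:09.962Z [INFO] agent: Deregistered service: service=_nomad-task-e8d2b77b-3bf5-96c1-8323-63b6151e2cf3-lb0-lb0-admin-admin
2021-01-26T14:50:34.554Z [INFO] agent: Synced service: service=_nomad-task-e8d2b77b-3bf5-96c1-8323-63b6151e2cf3-lb0-lb0-admin-admin
2021-01-26T14:50:39.984Z [INFO] agent: Deregistered service: service=_nomad-task-e8d2b77b-3bf5-96c1-8323-63b6151e2cf3-lb0-lb0-admin-admin
2021-01-26T14:51:04.589Z [INFO] agent: Synced service: service=_nomad-task-e8d2b77b-3bf5-96c1-8323-63b6151e2cf3-lb0-lb0-admin-admin
2021-01-26T14:51:10.009Z [INFO] agent: Deregistered service: service=_nomad-task-e8d2b77b-3bf5-96c1-8323-63b6151e2cf3-lb0-lb0-admin-admin
"""

LINE_RE = re.compile(
    r"^(\S+)Z \[INFO\] agent: (Synced|Deregistered) service: service=(\S+)$"
)


def parse_events(log):
    """Return (timestamp, kind, service_id) tuples for service sync/deregister lines."""
    events = []
    for line in log.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            ts = datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S.%f")
            events.append((ts, match.group(2), match.group(3)))
    return events


def measure_gaps(events):
    """Return (sync-to-deregister gaps, sync-to-sync gaps) in seconds, per service."""
    last_sync = {}
    sync_to_dereg, sync_to_sync = [], []
    for ts, kind, service in events:
        if kind == "Synced":
            if service in last_sync:
                sync_to_sync.append((ts - last_sync[service]).total_seconds())
            last_sync[service] = ts
        elif service in last_sync:
            sync_to_dereg.append((ts - last_sync[service]).total_seconds())
    return sync_to_dereg, sync_to_sync


if __name__ == "__main__":
    dereg_gaps, sync_gaps = measure_gaps(parse_events(SAMPLE_LOG))
    print("sync -> deregister (s):", [round(g, 3) for g in dereg_gaps])
    print("sync -> sync (s):      ", [round(g, 3) for g in sync_gaps])
```

On the sample, each deregistration follows its sync by roughly 5.4 seconds and the syncs repeat about every 30 seconds, which is consistent with the 30-second loop mentioned above.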
Here is what I have:
[2021-01-28 11:31:50 +0100 CET][C] 'nomad.client.consul.service_deregistrations.xxx': Count: 1 Sum: 1.000 LastUpdated: 2021-01-28 11:31:56.838423778 +0100 CET m=+69783.568045089
[2021-01-28 11:31:50 +0100 CET][C] 'nomad.client.consul.service_registrations.xxx': Count: 3 Sum: 3.000 LastUpdated: 2021-01-28 11:31:57.003773987 +0100 CET m=+69783.733395294
[2021-01-28 11:32:20 +0100 CET][C] 'nomad.client.consul.service_deregistrations.xxx': Count: 1 Sum: 1.000 LastUpdated: 2021-01-28 11:32:27.048990484 +0100 CET m=+69813.778611788
[2021-01-28 11:32:20 +0100 CET][C] 'nomad.client.consul.service_registrations.xxx': Count: 3 Sum: 3.000 LastUpdated: 2021-01-28 11:32:27.235769776 +0100 CET m=+69813.965391079
Hi,
I have two nodes in my test cluster:
- Node A, client and server.
- Node B, client and server.
Both nodes are started with the same configuration. However, when I look at the logs at TRACE level, I see the following:
Node A:
2021-01-28T15:58:55.519+0100 [DEBUG] consul.sync: sync complete: registered_services=3 deregistered_services=1 registered_checks=0 deregistered_checks=0
Node B:
2021-01-28T15:58:59.037+0100 [DEBUG] consul.sync: sync complete: registered_services=1 deregistered_services=3 registered_checks=0 deregistered_checks=0
Indeed, Node A has 3 jobs running while Node B has 1. It seems each node is reverting the changes made by the other.
Name   Address       Port  Status  Leader  Protocol  Build  Datacenter  Region
NodeA  198.51.100.1  4648  alive   false   2         1.0.2  us1         us
NodeB  198.51.100.2  4648  alive   true    2         1.0.2  us1         us
So did I miss something in my configuration? How can I prevent this?
This behavior is actually described in the documentation. I just overlooked it:
An important requirement is that each Nomad agent talks to a unique Consul agent. Nomad agents should be configured to talk to Consul agents and not Consul servers. If you are observing flapping services, you may have multiple Nomad agents talking to the same Consul agent. As such avoid configuring Nomad to talk to Consul via DNS such as consul.service.consul
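To make the flapping concrete, here is a toy model in Python. It is not Nomad's actual code. It assumes, as a simplification, that each Nomad client's sync registers its own services and deregisters any `_nomad-`-prefixed service on its Consul agent that it does not know about. The classes `ConsulAgent` and `NomadClient` and the service names are hypothetical.

```python
class ConsulAgent:
    """Stand-in for one Consul agent's local service list (hypothetical)."""

    def __init__(self):
        self.services = set()


class NomadClient:
    """Simplified Nomad client that reconciles its services against one agent."""

    def __init__(self, name, consul, services):
        self.name = name
        self.consul = consul
        self.services = set(services)

    def sync(self):
        """Run one reconciliation pass; return (registered, deregistered) counts."""
        # Assumption: anything with the _nomad- prefix that this client does
        # not own is treated as stale and removed.
        foreign = {
            s for s in self.consul.services
            if s.startswith("_nomad-") and s not in self.services
        }
        missing = self.services - self.consul.services
        self.consul.services -= foreign
        self.consul.services |= missing
        return len(missing), len(foreign)


# Hypothetical service IDs: three on node A, one on node B.
A_SERVICES = ["_nomad-task-a1", "_nomad-task-a2", "_nomad-task-a3"]
B_SERVICES = ["_nomad-task-b1"]

if __name__ == "__main__":
    # Both clients pointed at the same Consul agent: they undo each other.
    shared = ConsulAgent()
    node_a = NomadClient("A", shared, A_SERVICES)
    node_b = NomadClient("B", shared, B_SERVICES)
    node_a.sync()
    for _ in range(2):
        print("shared   B:", node_b.sync())
        print("shared   A:", node_a.sync())

    # One Consul agent per client: after the first pass nothing changes.
    node_a = NomadClient("A", ConsulAgent(), A_SERVICES)
    node_b = NomadClient("B", ConsulAgent(), B_SERVICES)
    node_a.sync()
    node_b.sync()
    print("separate A:", node_a.sync())
    print("separate B:", node_b.sync())
```

With the shared agent, node B reports 1 registered and 3 deregistered on every pass and node A reports 3 registered and 1 deregistered, the same counts as the `sync complete` lines posted above. With one agent per client, both settle at zero changes.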
Hello, I need help. Whenever I run a job in Nomad, it registers in Consul, but within a few seconds it gets deregistered. Can anyone please provide a solution?
Thank you! This actually solves the problem. Each Nomad node must have its own Consul agent set in its config file.