Nomad constantly deregisters Consul services

spack971 · January 26, 2021, 2:59pm

I have set up a Nomad cluster along with a Consul instance so that the jobs can register services to connect to.

However, the services keep getting synced and deregistered. Here is what I have from the Consul logs:

    2021-01-26T14:49:59.174Z [INFO]  agent: Synced check: check=_nomad-check-dc23801467b8a65a4fd82311c2606724a180065c
    2021-01-26T14:50:00.072Z [INFO]  agent: Synced check: check=_nomad-check-1783c554d9ee0a25d52532f4178c392e931e4bb1
    2021-01-26T14:50:04.511Z [INFO]  agent: Synced service: service=_nomad-task-e8d2b77b-3bf5-96c1-8323-63b6151e2cf3-lb0-lb0-admin-admin
    2021-01-26T14:50:09.962Z [INFO]  agent: Deregistered service: service=_nomad-task-e8d2b77b-3bf5-96c1-8323-63b6151e2cf3-lb0-lb0-admin-admin
    2021-01-26T14:50:34.554Z [INFO]  agent: Synced service: service=_nomad-task-e8d2b77b-3bf5-96c1-8323-63b6151e2cf3-lb0-lb0-admin-admin
    2021-01-26T14:50:39.984Z [INFO]  agent: Deregistered service: service=_nomad-task-e8d2b77b-3bf5-96c1-8323-63b6151e2cf3-lb0-lb0-admin-admin
    2021-01-26T14:51:04.589Z [INFO]  agent: Synced service: service=_nomad-task-e8d2b77b-3bf5-96c1-8323-63b6151e2cf3-lb0-lb0-admin-admin
    2021-01-26T14:51:10.009Z [INFO]  agent: Deregistered service: service=_nomad-task-e8d2b77b-3bf5-96c1-8323-63b6151e2cf3-lb0-lb0-admin-admin

Nothing in the Nomad logs shows why this happens.

Any idea what could cause this issue?

Nomad v1.0.2
Consul v1.9.1

shoenig · January 27, 2021, 2:50pm

Hi @spack971, are you able to confirm whether it is Nomad actually triggering deregistrations? You should be able to see metrics for client.consul.service_deregistrations (and client.consul.service_registrations) from this client accumulating if that is the case. That consistent ~5 second gap before the deregistrations is curious: Nomad’s reconciliation loop for Consul services runs every 30 seconds.
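The timing mentioned above can be checked directly against the Consul log excerpt from the first post. The following is a minimal sketch, not an official tool: the function names `parse_events` and `flap_gaps` are made up for illustration, and it assumes the log line layout shown in this thread (ISO timestamp first, `service=` last).

```python
from datetime import datetime

# Service lines copied from the Consul log excerpt in the first post.
LOG_EXCERPT = """\
2021-01-26T14:50:04.511Z [INFO]  agent: Synced service: service=_nomad-task-e8d2b77b-3bf5-96c1-8323-63b6151e2cf3-lb0-lb0-admin-admin
2021-01-26T14:50:09.962Z [INFO]  agent: Deregistered service: service=_nomad-task-e8d2b77b-3bf5-96c1-8323-63b6151e2cf3-lb0-lb0-admin-admin
2021-01-26T14:50:34.554Z [INFO]  agent: Synced service: service=_nomad-task-e8d2b77b-3bf5-96c1-8323-63b6151e2cf3-lb0-lb0-admin-admin
2021-01-26T14:50:39.984Z [INFO]  agent: Deregistered service: service=_nomad-task-e8d2b77b-3bf5-96c1-8323-63b6151e2cf3-lb0-lb0-admin-admin
2021-01-26T14:51:04.589Z [INFO]  agent: Synced service: service=_nomad-task-e8d2b77b-3bf5-96c1-8323-63b6151e2cf3-lb0-lb0-admin-admin
2021-01-26T14:51:10.009Z [INFO]  agent: Deregistered service: service=_nomad-task-e8d2b77b-3bf5-96c1-8323-63b6151e2cf3-lb0-lb0-admin-admin
"""


def parse_events(text):
    """Return (timestamp, kind, service_id) for each service sync/deregister line."""
    events = []
    for line in text.splitlines():
        if "Synced service:" in line:
            kind = "synced"
        elif "Deregistered service:" in line:
            kind = "deregistered"
        else:
            continue  # check syncs and unrelated lines are ignored
        stamp = datetime.strptime(line.split()[0], "%Y-%m-%dT%H:%M:%S.%fZ")
        service = line.rsplit("service=", 1)[1].strip()
        events.append((stamp, kind, service))
    return events


def flap_gaps(events):
    """Per service, measure sync-to-deregister and sync-to-next-sync gaps in seconds."""
    last_sync = {}
    sync_to_dereg, sync_to_sync = [], []
    for stamp, kind, service in events:
        if kind == "synced":
            if service in last_sync:
                sync_to_sync.append((stamp - last_sync[service]).total_seconds())
            last_sync[service] = stamp
        elif service in last_sync:
            sync_to_dereg.append((stamp - last_sync[service]).total_seconds())
    return sync_to_dereg, sync_to_sync


if __name__ == "__main__":
    to_dereg, to_sync = flap_gaps(parse_events(LOG_EXCERPT))
    print("sync -> deregister gaps (s):", to_dereg)
    print("sync -> next sync gaps (s):", to_sync)
```

On this excerpt the registrations repeat roughly every 30 seconds, which fits a single reconciliation loop, while each deregistration follows about 5.4 seconds later, which hints at a second actor running on its own schedule.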

spack971 · January 28, 2021, 10:38am

Here is what I have:

    [2021-01-28 11:31:50 +0100 CET][C] 'nomad.client.consul.service_deregistrations.xxx': Count: 1 Sum: 1.000 LastUpdated: 2021-01-28 11:31:56.838423778 +0100 CET m=+69783.568045089
    [2021-01-28 11:31:50 +0100 CET][C] 'nomad.client.consul.service_registrations.xxx': Count: 3 Sum: 3.000 LastUpdated: 2021-01-28 11:31:57.003773987 +0100 CET m=+69783.733395294
    [2021-01-28 11:32:20 +0100 CET][C] 'nomad.client.consul.service_deregistrations.xxx': Count: 1 Sum: 1.000 LastUpdated: 2021-01-28 11:32:27.048990484 +0100 CET m=+69813.778611788
    [2021-01-28 11:32:20 +0100 CET][C] 'nomad.client.consul.service_registrations.xxx': Count: 3 Sum: 3.000 LastUpdated: 2021-01-28 11:32:27.235769776 +0100 CET m=+69813.965391079

spack971 · January 28, 2021, 3:02pm

Hi,

I have two nodes in my test cluster:

Node A, client and server.
Node B, client and server.

Both nodes are started with the same configuration. However, when I look at the logs at TRACE level, I see the following:

Node A:

    2021-01-28T15:58:55.519+0100 [DEBUG] consul.sync: sync complete: registered_services=3 deregistered_services=1 registered_checks=0 deregistered_checks=0

Node B:

    2021-01-28T15:58:59.037+0100 [DEBUG] consul.sync: sync complete: registered_services=1 deregistered_services=3 registered_checks=0 deregistered_checks=0

Indeed, Node A has 3 jobs running while Node B has 1. It seems each node is reverting the changes made by the other.

    Name   Address       Port  Status  Leader  Protocol  Build  Datacenter  Region
    NodeA  198.51.100.1  4648  alive   false   2         1.0.2  us1         us
    NodeB  198.51.100.2  4648  alive   true    2         1.0.2  us1         us

Did I miss something in my configuration? How can I prevent this?
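The counts in the two `sync complete` lines above can be reproduced with a toy model. This is a simplified sketch and not Nomad's actual code: the classes `ConsulAgent` and `NomadClient`, the `simulate` helper, and the service IDs are all hypothetical. It assumes that each client, on every sync, registers the services it wants and deregisters any `_nomad-` prefixed service it does not recognize.

```python
NOMAD_PREFIX = "_nomad-"  # prefix seen on the service IDs in the Consul log


class ConsulAgent:
    """Toy stand-in for one Consul agent's local service registry."""

    def __init__(self):
        self.services = set()


class NomadClient:
    """Toy stand-in for a Nomad client reconciling its services with Consul."""

    def __init__(self, desired, consul):
        self.desired = set(desired)
        self.consul = consul

    def sync(self):
        # Register what this client wants but the agent does not have yet.
        missing = self.desired - self.consul.services
        # Assumption: Nomad-prefixed services unknown to this client are removed.
        stale = {
            s for s in self.consul.services
            if s.startswith(NOMAD_PREFIX) and s not in self.desired
        }
        self.consul.services |= missing
        self.consul.services -= stale
        return len(missing), len(stale)


def simulate(shared, rounds=3):
    """Run alternating syncs for Node A (3 services) and Node B (1 service)."""
    agent_a = ConsulAgent()
    agent_b = agent_a if shared else ConsulAgent()
    node_a = NomadClient(
        {"_nomad-task-a1", "_nomad-task-a2", "_nomad-task-a3"}, agent_a
    )
    node_b = NomadClient({"_nomad-task-b1"}, agent_b)
    history = []
    for _ in range(rounds):
        history.append(("A", *node_a.sync()))
        history.append(("B", *node_b.sync()))
    return history


if __name__ == "__main__":
    for label, shared in (("shared Consul agent", True), ("one agent per node", False)):
        print(label)
        for node, registered, deregistered in simulate(shared):
            print(f"  Node {node}: registered={registered} deregistered={deregistered}")
```

With a shared agent, the model settles into Node A reporting 3 registered and 1 deregistered and Node B reporting 1 registered and 3 deregistered on every pass, the same pattern as the logs. With one agent per node, both clients go quiet after the first sync.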

spack971 · January 29, 2021, 1:08pm

This behavior is actually described in the documentation. I just overlooked it:

An important requirement is that each Nomad agent talks to a unique Consul agent. Nomad agents should be configured to talk to Consul agents and not Consul servers. If you are observing flapping services, you may have multiple Nomad agents talking to the same Consul agent. As such avoid configuring Nomad to talk to Consul via DNS such as consul.service.consul

husain185 · October 19, 2023, 9:04am

Hello, I need help. Whenever I run a job in Nomad, it registers in Consul, but within a few seconds it gets deregistered. Can anyone please provide a solution?

Material-Scientist · July 28, 2024, 1:09pm

Thank you! This actually solves the problem. Each Nomad node must have its own Consul agent set in its config file.
