Fault tolerance in service registration?

clusby · February 13, 2020, 11:18am

Hi, a question on making device registrations and health checks more fault tolerant.

If I:

1. Register a service and health check on an agent (using the HTTP API, /agent/service/register)
2. Simulate a fault on the agent (e.g. kill -9)

Then the service and health check from that agent are marked as failed or deregistered, even though the service itself is still running fine.
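For concreteness, the registration in the first step looks roughly like the sketch below. The service ID, name, port, and health URL are made-up placeholders, and the agent address is assumed to be the default local one (http://127.0.0.1:8500); only the `PUT /v1/agent/service/register` endpoint and the `ID`/`Name`/`Port`/`Check` fields come from the Consul agent API.

```python
import json
import urllib.request

# Assumption: the local agent listens on the default HTTP address.
AGENT_ADDR = "http://127.0.0.1:8500"


def build_registration(service_id, name, port, health_url, interval="10s"):
    """Build the JSON body for PUT /v1/agent/service/register."""
    return {
        "ID": service_id,
        "Name": name,
        "Port": port,
        # An HTTP check that the agent runs against the service.
        "Check": {"HTTP": health_url, "Interval": interval},
    }


def register(payload, agent_addr=AGENT_ADDR, timeout=5):
    """Send the registration to the local agent; True on HTTP 200."""
    req = urllib.request.Request(
        f"{agent_addr}/v1/agent/service/register",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.status == 200


# Hypothetical service details, for illustration only.
payload = build_registration(
    "web-1", "web", 8080, "http://127.0.0.1:8080/health"
)
```

Calling `register(payload)` once at startup is what our service does today.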

This matches my understanding based on the docs.

What is the recommended approach for building fault tolerance into our service discovery for clients?

Currently our service just registers itself on startup. Is it expected to also poll the catalog, detect that its registration is missing, and re-register itself? Is there a better approach?
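To make the "poll and re-register" idea concrete, here is a minimal sketch of what I have in mind. It asks the local agent (not the catalog) whether the service ID is still known, using `GET /v1/agent/service/<id>`, which returns 404 for an unknown ID, and re-registers if not. `ensure_registered`, `watch`, and the `register` callback are hypothetical names of my own, the 30 second interval is arbitrary, and I'm not claiming this is the recommended pattern.

```python
import time
import urllib.error
import urllib.request

# Assumption: the local agent listens on the default HTTP address.
AGENT_ADDR = "http://127.0.0.1:8500"


def is_registered(service_id, agent_addr=AGENT_ADDR, timeout=5):
    """Ask the local agent whether it still knows this service ID."""
    url = f"{agent_addr}/v1/agent/service/{service_id}"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except urllib.error.HTTPError as err:
        if err.code == 404:  # agent is up but has no such service
            return False
        raise


def ensure_registered(service_id, check, register):
    """Re-register if the agent lost the service.

    Returns True only when a re-registration was performed.
    """
    try:
        present = check(service_id)
    except OSError:
        # Agent unreachable (connection refused, timeout, ...):
        # nothing to register against, so try again on the next tick.
        return False
    if present:
        return False
    register()
    return True


def watch(service_id, register, interval_s=30):
    """Poll forever; `register` is whatever the service calls at startup."""
    while True:
        ensure_registered(service_id, is_registered, register)
        time.sleep(interval_s)
```

Note that this does nothing useful while the agent is unreachable; it only restores the registration once an agent is answering again.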

Thanks
