Hi @kschoche1, this is a K8s cluster with Consul used purely as a standalone service-name database. Service discovery and registration are not integrated with K8s at all. The Consul server is deployed as a StatefulSet (size 3), and the client DaemonSet is just a K8s pod on each node. Out-of-the-box vanilla.
In the container restart scenario, the service registrations are preserved. Is this because the Consul client uses the “data_dir” to persist the service registrations across container restarts? Obviously any pod-local storage data is lost on a pod restart.
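If it helps clarify what I'm seeing, here's a minimal sketch of how one could inspect what the client has persisted. It assumes the agent's data_dir is mounted at `/consul/data` and that locally registered services land as JSON files under `<data_dir>/services` (adjust the path if your layout differs):

```go
// list_persisted.go: dump whatever the Consul client has persisted for
// locally registered services. Path and layout are assumptions, not
// something I've confirmed against the agent source.
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

func main() {
	dir := "/consul/data/services" // assumption: data_dir = /consul/data
	entries, err := os.ReadDir(dir)
	if err != nil {
		fmt.Fprintf(os.Stderr, "cannot read %s: %v\n", dir, err)
		os.Exit(1)
	}
	for _, e := range entries {
		b, err := os.ReadFile(filepath.Join(dir, e.Name()))
		if err != nil {
			fmt.Fprintf(os.Stderr, "skip %s: %v\n", e.Name(), err)
			continue
		}
		fmt.Printf("--- %s ---\n%s\n", e.Name(), b)
	}
}
```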
I was hoping that some combination of command-line switches to the client would create a scenario where the remainder of the cluster accepts a previous member back and then backfills that client with its prior service registration data.
Is that possible at all? There's no other way to support pod-delete test cases without some sort of sidecar logic in the user's application pods.
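For concreteness, this is roughly the sidecar/init-container logic I mean, as a minimal sketch using the official github.com/hashicorp/consul/api client. The service name, ID, port, and agent address are placeholders:

```go
// reregister.go: on pod start, re-register the service with the local
// client agent via the Go API client -- i.e. the logic I'd rather not
// have to bolt onto every application pod.
package main

import (
	"log"

	"github.com/hashicorp/consul/api"
)

func main() {
	cfg := api.DefaultConfig()
	cfg.Address = "127.0.0.1:8500" // assumes the client agent is reachable on localhost

	client, err := api.NewClient(cfg)
	if err != nil {
		log.Fatalf("consul client: %v", err)
	}

	// Hypothetical service definition; in practice this would come from
	// the pod's own configuration.
	reg := &api.AgentServiceRegistration{
		ID:   "web-1",
		Name: "web",
		Port: 8080,
	}
	if err := client.Agent().ServiceRegister(reg); err != nil {
		log.Fatalf("service register: %v", err)
	}
	log.Printf("re-registered %q with the local agent", reg.Name)
}
```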
As a follow-on question:
When the Consul client pod is deleted, are there any command-line args to ask the remainder of the Consul cluster to be patient before declaring all of those services “critical”? Something like “FailuresBeforeCritical”, but for a missing DaemonSet pod? I can see how such a thing could potentially wreak havoc on quickly and deterministically identifying a Consul client failure, though. And honestly, I have not yet tested this scenario with a large FailuresBeforeCritical on each service, but I suspect it would have no effect in the scenario where the Consul client pod itself is deleted.
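For completeness, this is the per-check knob I'm referring to, plus how I'd observe what the rest of the cluster reports after deleting the client pod. A minimal sketch, assuming the Go api package exposes FailuresBeforeCritical on the check definition (please correct me if that field name is off); the service and endpoint are the same hypothetical ones as in the earlier sketch:

```go
package main

import (
	"log"

	"github.com/hashicorp/consul/api"
)

func main() {
	client, err := api.NewClient(api.DefaultConfig())
	if err != nil {
		log.Fatalf("consul client: %v", err)
	}

	// Register the hypothetical "web" service, this time asking the agent to
	// tolerate several failed probes before the check flips to critical.
	err = client.Agent().ServiceRegister(&api.AgentServiceRegistration{
		ID:   "web-1",
		Name: "web",
		Port: 8080,
		Check: &api.AgentServiceCheck{
			HTTP:                   "http://127.0.0.1:8080/health", // hypothetical endpoint
			Interval:               "10s",
			FailuresBeforeCritical: 5, // the knob in question (assumed field name)
		},
	})
	if err != nil {
		log.Fatalf("service register: %v", err)
	}

	// Then, after deleting the client pod, query the catalog-side health view
	// to see whether the remaining cluster has marked the checks critical.
	entries, _, err := client.Health().Service("web", "", false, nil)
	if err != nil {
		log.Fatalf("health query: %v", err)
	}
	for _, e := range entries {
		for _, c := range e.Checks {
			log.Printf("%s on %s: %s", c.CheckID, e.Node.Node, c.Status)
		}
	}
}
```

My expectation is that this only covers failures of the check itself, not the agent/pod disappearing, which is exactly why I suspect it won't help here.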