Clarification on `consul lock` behaviours and rotation of Consul Server nodes

jbeemster · July 21, 2021, 9:36am

Hi everyone,

We recently started using the `consul lock` command to enforce singleton behavior for certain jobs running in our Nomad cluster. It has worked really well at ensuring that exactly one execution is running at any given time.

However, we ran into an issue when we updated our Consul Server nodes: jobs that held an active lock session suddenly lost it, and the locked processes were terminated abruptly. The Consul cluster itself remained available throughout the rolling upgrade.

Our hunch is that, because we were using a load-balanced endpoint, the `consul lock` process could not resolve the new Consul Server nodes correctly (IP caching?) and therefore could not re-establish its connection.
If that is right, using a local Consul agent instead should resolve it, since the agent would handle connecting to the new Consul Server nodes automatically.
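
To make the comparison concrete, here is a minimal Python sketch of the kind of invocation I mean, pointing `consul lock` at a local agent rather than the load balancer. The lock prefix, the child command path and the retry count are hypothetical placeholders, not our real values. `-http-addr` and `-monitor-retry` are existing `consul lock` flags, but whether a higher retry count is enough to ride out a server rotation is exactly what I am unsure about.

```python
import shlex


def build_lock_command(prefix, child, http_addr="127.0.0.1:8500", monitor_retry=10):
    """Build the argv for `consul lock` talking to a local Consul agent.

    prefix        -- KV prefix used for the lock (hypothetical example value below)
    child         -- list with the child command and its arguments
    http_addr     -- address of the local agent instead of a load balancer
    monitor_retry -- retries on 500 errors while monitoring the lock
                     (10 is an assumed value, not a recommendation)
    """
    return [
        "consul", "lock",
        f"-http-addr={http_addr}",
        f"-monitor-retry={monitor_retry}",
        prefix,
        *child,
    ]


# Hypothetical lock prefix and job binary, for illustration only.
cmd = build_lock_command("locks/nightly-report", ["/usr/local/bin/run-report"])
print(shlex.join(cmd))
```

The sketch only assembles and prints the command line; actually running it would be a matter of passing `cmd` to something like `subprocess.run` on a host with a local agent.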

So I am searching for clarification on two points here:

1. Is it possible for a `consul lock` session to be moved to a new Consul Server node during a rolling update? If yes, what sort of timeouts are recommended to give it enough time and prevent the process from being killed?

2. Is my understanding of what caused the issue correct? We have already migrated to using Consul agents instead of the load-balanced endpoint, but want to be sure the next rolling update won’t break our jobs!

I am going to run my own tests in a test rig to check what happens in both cases, but as I could not find the answers in the documentation I wanted to pose the question anyway!

Consul Version: 1.9.7
