2 of 3 Consul Leaders in Cluster Deleted - Seeking Assistance for Recovery

Hi,

Two of the three servers in our Consul cluster were deleted. They have since been recreated, but their IP addresses have changed. How do we gracefully remove the stale entries for the two deleted hosts, add the two new hosts to the cluster, and force a clean leader election?

Below are error logs from one of the recreated hosts. You can see it attempts to reference the older/stale hosts and IP addresses:

Reference lines…
Oct 4 15:16:43 CONSULSERVER1 consul[5262]: 2024/10/04 15:16:43 [WARN] raft: Unable to get address for server id 08323d16-ab38-bd2b-8d7d-743d0688f409, using fallback address 10.104.3.130:8300: Could not find address for server id 08323d16-ab38-bd2b-8d7d-743d0688f409
Oct 4 15:16:43 CONSULSERVER1 consul[5262]: 2024/10/04 15:16:43 [WARN] raft: Unable to get address for server id 64eccd68-0ee2-4396-23ca-cbfdf03b8c44, using fallback address 10.104.1.206:8300: Could not find address for server id 64eccd68-0ee2-4396-23ca-cbfdf03b8c44

Full consul logs from startup (on the non-deleted/untouched server):
Oct 04 15:10:07 CONSULSERVER2 consul[20150]: 2024/10/04 15:10:07 [ERR] agent: Coordinate update error: No cluster leader
Oct 04 15:10:23 CONSULSERVER2 consul[20150]: 2024/10/04 15:10:23 [ERR] agent: failed to sync remote state: No cluster leader

Below are the error logs from the last remaining Consul server, the one that was not deleted:

Oct 4 15:16:35 CONSULSERVER1 systemd[1]: Started consul agent.
Oct 4 15:16:36 CONSULSERVER1 consul[5262]: WARNING: LAN keyring exists but -encrypt given, using keyring
Oct 4 15:16:36 CONSULSERVER1 consul[5262]: WARNING: WAN keyring exists but -encrypt given, using keyring
Oct 4 15:16:36 CONSULSERVER1 consul[5262]: bootstrap_expect > 0: expecting 3 servers
Oct 4 15:16:36 CONSULSERVER1 consul[5262]: ==> Starting Consul agent...
Oct 4 15:16:36 CONSULSERVER1 consul[5262]: ==> Consul agent running!
Oct 4 15:16:36 CONSULSERVER1 consul[5262]: Version: 'v1.4.4'
Oct 4 15:16:36 CONSULSERVER1 consul[5262]: Node ID: '6e776848-a3d4-294f-f58e-ab33db8ae7f6'
Oct 4 15:16:36 CONSULSERVER1 consul[5262]: Node name: 'CONSULSERVER1'
Oct 4 15:16:36 CONSULSERVER1 consul[5262]: Datacenter: 'PRODDATACENTER1' (Segment: '')
Oct 4 15:16:36 CONSULSERVER1 consul[5262]: Server: true (Bootstrap: false)
Oct 4 15:16:36 CONSULSERVER1 consul[5262]: Client Addr: [0.0.0.0] (HTTP: 8500, HTTPS: -1, gRPC: -1, DNS: 8600)
Oct 4 15:16:36 CONSULSERVER1 consul[5262]: Cluster Addr: 10.104.6.124 (LAN: 8301, WAN: 8302)
Oct 4 15:16:36 CONSULSERVER1 consul[5262]: Encrypt: Gossip: true, TLS-Outgoing: false, TLS-Incoming: false
Oct 4 15:16:36 CONSULSERVER1 consul[5262]: ==> Log data will now stream in as it occurs:
Oct 4 15:16:36 CONSULSERVER1 consul[5262]: 2024/10/04 15:16:36 [INFO] raft: Restored from snapshot 37-38777320-1727393634958
Oct 4 15:16:36 CONSULSERVER1 consul[5262]: 2024/10/04 15:16:36 [INFO] discover-aws: Address type is not supported. Valid values are {private_v4,public_v4,public_v6}. Falling back to 'private_v4'
Oct 4 15:16:36 CONSULSERVER1 consul[5262]: 2024/10/04 15:16:36 [INFO] raft: Initial configuration (index=33592248): [{Suffrage:Voter ID:6e776848-a3d4-294f-f58e-ab33db8ae7f6 Address:10.104.6.124:8300} {Suffrage:Voter ID:64eccd68-0ee2-4396-23ca-cbfdf03b8c44 Address:10.104.1.206:8300} {Suffrage:Voter ID:08323d16-ab38-bd2b-8d7d-743d0688f409 Address:10.104.3.130:8300}]
Oct 4 15:16:36 CONSULSERVER1 consul[5262]: 2024/10/04 15:16:36 [INFO] discover-aws: Region is us-east-1
Oct 4 15:16:36 CONSULSERVER1 consul[5262]: 2024/10/04 15:16:36 [INFO] raft: Node at 10.104.6.124:8300 [Follower] entering Follower state (Leader: "")
Oct 4 15:16:36 CONSULSERVER1 consul[5262]: 2024/10/04 15:16:36 [INFO] serf: EventMemberJoin: CONSULSERVER1.PRODDATACENTER1 10.104.6.124
Oct 4 15:16:36 CONSULSERVER1 consul[5262]: 2024/10/04 15:16:36 [INFO] serf: Attempting re-join to previously known node: CONSULSERVER2.PRODDATACENTER1: 10.104.3.129:8302
Oct 4 15:16:36 CONSULSERVER1 consul[5262]: 2024/10/04 15:16:36 [INFO] serf: EventMemberJoin: CONSULSERVER1 10.104.6.124
Oct 4 15:16:36 CONSULSERVER1 consul[5262]: 2024/10/04 15:16:36 [INFO] agent: Started DNS server 0.0.0.0:8600 (udp)
Oct 4 15:16:36 CONSULSERVER1 consul[5262]: 2024/10/04 15:16:36 [INFO] consul: Adding LAN server CONSULSERVER1 (Addr: tcp/10.104.6.124:8300) (DC: PRODDATACENTER1)
Oct 4 15:16:36 CONSULSERVER1 consul[5262]: 2024/10/04 15:16:36 [WARN] agent/proxy: running as root, will not start managed proxies
Oct 4 15:16:36 CONSULSERVER1 consul[5262]: 2024/10/04 15:16:36 [INFO] consul: Raft data found, disabling bootstrap mode
Oct 4 15:16:36 CONSULSERVER1 consul[5262]: 2024/10/04 15:16:36 [INFO] consul: Handled member-join event for server "CONSULSERVER1.PRODDATACENTER1" in area "wan"
Oct 4 15:16:36 CONSULSERVER1 consul[5262]: 2024/10/04 15:16:36 [INFO] agent: Started DNS server 0.0.0.0:8600 (tcp)
Oct 4 15:16:36 CONSULSERVER1 consul[5262]: 2024/10/04 15:16:36 [INFO] agent: Started HTTP server on [::]:8500 (tcp)
Oct 4 15:16:36 CONSULSERVER1 consul[5262]: 2024/10/04 15:16:36 [INFO] agent: started state syncer
Oct 4 15:16:36 CONSULSERVER1 consul[5262]: 2024/10/04 15:16:36 [INFO] agent: Retry join LAN is supported for: aliyun aws azure digitalocean gce k8s os packet scaleway softlayer triton vsphere
Oct 4 15:16:36 CONSULSERVER1 consul[5262]: 2024/10/04 15:16:36 [INFO] agent: Joining LAN cluster...
Oct 4 15:16:36 CONSULSERVER1 consul[5262]: 2024/10/04 15:16:36 [INFO] discover-aws: Filter instances with aws:autoscaling:groupName=NJ-CONSUL-ASGRP
Oct 4 15:16:36 CONSULSERVER1 consul[5262]: 2024/10/04 15:16:36 [INFO] serf: EventMemberJoin: CONSULSERVER2.PRODDATACENTER1 10.104.3.129
Oct 4 15:16:36 CONSULSERVER1 consul[5262]: 2024/10/04 15:16:36 [INFO] serf: EventMemberJoin: CONSULSERVER3.PRODDATACENTER1 10.104.1.34
Oct 4 15:16:36 CONSULSERVER1 consul[5262]: 2024/10/04 15:16:36 [WARN] memberlist: Refuting an alive message for ‘CONSULSERVER1.PRODDATACENTER1’ (10.104.6.124:8302) meta:([255 140 167 115 101 103 109 101 110 116 160 167 118 115 110 95 109 105 110 161 50 165 98 117 105 108 100 174 49 46 52 46 52 58 101 97 53 50 49 48 97 51 164 112 111 114 116 164 56 51 48 48 164 114 111 108 101 166 99 111 110 115 117 108 162 100 99 173 112 118 116 95 117 115 45 101 97 115 116 45 49 167 118 115 110 95 109 97 120 161 51 168 114 97 102 116 95 118 115 110 161 51 166 101 120 112 101 99 116 161 51 164 97 99 108 115 161 48 162 105 100 218 0 36 54 101 55 55 54 56 52 56 45 97 51 100 52 45 50 57 52 102 45 102 53 56 101 45 97 98 51 51 100 98 56 97 101 55 102 54 163 118 115 110 161 50] VS [255 140 162 100 99 173 112 118 116 95 117 115 45 101 97 115 116 45 49 167 118 115 110 95 109 97 120 161 51 168 114 97 102 116 95 118 115 110 161 51 166 101 120 112 101 99 116 161 51 164 97 99 108 115 161 48 164 114 111 108 101 166 99 111 110 115 117 108 162 105 100 218 0 36 54 101 55 55 54 56 52 56 45 97 51 100 52 45 50 57 52 102 45 102 53 56 101 45 97 98 51 51 100 98 56 97 101 55 102 54 163 118 115 110 161 50 167 118 115 110 95 109 105 110 161 50 165 98 117 105 108 100 174 49 46 52 46 52 58 101 97 53 50 49 48 97 51 164 112 111 114 116 164 56 51 48 48 167 115 101 103 109 101 110 116 160]), vsn:([1 5 2 2 5 4] VS [1 5 2 2 5 4])
Oct 4 15:16:36 CONSULSERVER1 consul[5262]: 2024/10/04 15:16:36 [INFO] serf: Re-joined to previously known node: CONSULSERVER2.PRODDATACENTER1: 10.104.3.129:8302
Oct 4 15:16:36 CONSULSERVER1 consul[5262]: 2024/10/04 15:16:36 [INFO] consul: Handled member-join event for server "CONSULSERVER2.PRODDATACENTER1" in area "wan"
Oct 4 15:16:36 CONSULSERVER1 consul[5262]: 2024/10/04 15:16:36 [INFO] consul: Handled member-join event for server "CONSULSERVER3.PRODDATACENTER1" in area "wan"
Oct 4 15:16:36 CONSULSERVER1 consul[5262]: 2024/10/04 15:16:36 [INFO] serf: EventMemberJoin: <…

Oct 4 15:16:36 CONSULSERVER1 consul[5262]: 2024/10/04 15:16:36 [INFO] serf: EventMemberJoin: …>
Oct 4 15:16:36 CONSULSERVER1 consul[5262]: 2024/10/04 15:16:36 [INFO] consul: Adding LAN server CONSULSERVER2 (Addr: tcp/10.104.3.129:8300) (DC: PRODDATACENTER1)
Oct 4 15:16:36 CONSULSERVER1 consul[5262]: 2024/10/04 15:16:36 [INFO] consul: Adding LAN server CONSULSERVER3 (Addr: tcp/10.104.1.34:8300) (DC: PRODDATACENTER1)
Oct 4 15:16:39 CONSULSERVER1 consul[5262]: 2024/10/04 15:16:39 [WARN] raft: Heartbeat timeout from "" reached, starting election
Oct 4 15:16:39 CONSULSERVER1 consul[5262]: 2024/10/04 15:16:39 [INFO] raft: Node at 10.104.6.124:8300 [Candidate] entering Candidate state in term 134589
Oct 4 15:16:39 CONSULSERVER1 consul[5262]: 2024/10/04 15:16:39 [WARN] raft: Unable to get address for server id 08323d16-ab38-bd2b-8d7d-743d0688f409, using fallback address 10.104.3.130:8300: Could not find address for server id 08323d16-ab38-bd2b-8d7d-743d0688f409
Oct 4 15:16:39 CONSULSERVER1 consul[5262]: 2024/10/04 15:16:39 [WARN] raft: Unable to get address for server id 64eccd68-0ee2-4396-23ca-cbfdf03b8c44, using fallback address 10.104.1.206:8300: Could not find address for server id 64eccd68-0ee2-4396-23ca-cbfdf03b8c44
Oct 4 15:16:41 CONSULSERVER1 consul[5262]: ==> Newer Consul version available: 1.19.2 (currently running: 1.4.4)
Oct 4 15:16:43 CONSULSERVER1 consul[5262]: 2024/10/04 15:16:43 [WARN] raft: Election timeout reached, restarting election
Oct 4 15:16:43 CONSULSERVER1 consul[5262]: 2024/10/04 15:16:43 [INFO] raft: Node at 10.104.6.124:8300 [Candidate] entering Candidate state in term 134590
Oct 4 15:16:43 CONSULSERVER1 consul[5262]: 2024/10/04 15:16:43 [WARN] raft: Unable to get address for server id 08323d16-ab38-bd2b-8d7d-743d0688f409, using fallback address 10.104.3.130:8300: Could not find address for server id 08323d16-ab38-bd2b-8d7d-743d0688f409
Oct 4 15:16:43 CONSULSERVER1 consul[5262]: 2024/10/04 15:16:43 [WARN] raft: Unable to get address for server id 64eccd68-0ee2-4396-23ca-cbfdf03b8c44, using fallback address 10.104.1.206:8300: Could not find address for server id 64eccd68-0ee2-4396-23ca-cbfdf03b8c44
Oct 4 15:16:43 CONSULSERVER1 consul[5262]: 2024/10/04 15:16:43 [ERR] agent: failed to sync remote state: No cluster leader
Oct 4 15:16:48 CONSULSERVER1 consul[5262]: 2024/10/04 15:16:48 [WARN] raft: Election timeout reached, restarting election
Oct 4 15:16:48 CONSULSERVER1 consul[5262]: 2024/10/04 15:16:48 [INFO] raft: Node at 10.104.6.124:8300 [Candidate] entering Candidate state in term 134591
Oct 4 15:16:48 CONSULSERVER1 consul[5262]: 2024/10/04 15:16:48 [WARN] raft: Unable to get address for server id 08323d16-ab38-bd2b-8d7d-743d0688f409, using fallback address 10.104.3.130:8300: Could not find address for server id 08323d16-ab38-bd2b-8d7d-743d0688f409
Oct 4 15:16:48 CONSULSERVER1 consul[5262]: 2024/10/04 15:16:48 [WARN] raft: Unable to get address for server id 64eccd68-0ee2-4396-23ca-cbfdf03b8c44, using fallback address 10.104.1.206:8300: Could not find address for server id 64eccd68-0ee2-4396-23ca-cbfdf03b8c44
^C

Thank you.

Hi @JConsul,

Use the peers.json recovery procedure described in this doc: Disaster recovery for Consul clusters | Consul | HashiCorp Developer
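
As a rough illustration only, here is a small Python sketch that generates a peers.json for the three servers in your logs. It assumes Raft protocol version 3, where each entry carries `id`, `address` and `non_voter`, and it assumes all three servers should be voters. CONSULSERVER1's node ID and the three IPs are taken from your logs; the node IDs of the two recreated servers do not appear there, so the `<node-id-of-...>` strings are placeholders you would need to replace (each server's ID should be in the `node-id` file in its data directory, or on the "Node ID:" line it prints at startup). The `build_peers` helper name is mine, not anything from Consul.

```python
import json


def build_peers(servers):
    """Build a Raft-protocol-3 style peers.json payload.

    `servers` is a list of (node_id, ip) tuples. Every entry is written
    as a voter on the default server RPC port 8300.
    """
    return [
        {"id": node_id, "address": f"{ip}:8300", "non_voter": False}
        for node_id, ip in servers
    ]


# CONSULSERVER1's ID and all three IPs come from the logs in the question.
# The other two IDs are PLACEHOLDERS: replace them with the real node IDs
# of the recreated servers before using the output.
servers = [
    ("6e776848-a3d4-294f-f58e-ab33db8ae7f6", "10.104.6.124"),  # CONSULSERVER1
    ("<node-id-of-CONSULSERVER2>", "10.104.3.129"),
    ("<node-id-of-CONSULSERVER3>", "10.104.1.34"),
]

peers_json = json.dumps(build_peers(servers), indent=2)
print(peers_json)
```

Going by the doc, the general flow is to stop all Consul servers, place the same file at `raft/peers.json` under the data directory on each of them, and start them again so the stale 10.104.3.130 and 10.104.1.206 entries are replaced. Please check the exact steps for your version against the doc before running this in production.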

Thank you, Ranjandas. I will carry out this recovery based on the article you provided and report back before the end of the week.

Much appreciated!
