I stopped Vault_2 and Vault_4 and tried to force them to join the cluster, but they will not join Vault_3, the node on which the recovery procedure was run.
Can you please clarify what needs to be done to return the cluster to its initial state of 3 nodes?
Cluster reset: When a node is brought up in recovery mode, it resets the list of cluster members. This means that when resuming normal operations, each node will need to rejoin the cluster.
I tried this, but got the following error back from Vault_2 and Vault_4:
Error joining the node to the raft cluster: Error making API request.
URL: POST http://127.0.0.1:8200/v1/sys/storage/raft/join
Code: 500. Errors:
* raft storage is already initialized
To follow up on this: quorum is lost and has not been recovered.
According to the operations wiki, a manual recovery should be possible by creating a raft/peers.json file, however the format of this file is not described.
I was able to find it for Nomad; I suppose it is the same?
If it’s the same for Consul, too, I would say you are right.
For Raft protocol version 2 and earlier, this should be formatted as a JSON array containing the address and port of each Consul server in the cluster, like this:
For Raft protocol version 3 and later, this should be formatted as a JSON array containing the node ID, address:port, and suffrage information of each Consul server in the cluster, like this:
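The examples that followed "like this:" in the quoted documentation did not survive in this thread. As a rough sketch of the two shapes described, the Python below builds both variants. The field names `id`, `address`, and `non_voter` are what I recall from the Consul outage recovery guide, and the node IDs, IP addresses, and port 8300 are made-up placeholders; treat all of them as assumptions and check them against the documentation for your Vault version before writing a real raft/peers.json.

```python
import json


def peers_v2(addresses):
    """Raft protocol version 2 and earlier: a JSON array of "address:port" strings."""
    return json.dumps(list(addresses), indent=2)


def peers_v3(servers):
    """Raft protocol version 3 and later: one JSON object per server.

    `servers` is an iterable of (node_id, address, non_voter) tuples.
    The key names are assumed from the Consul documentation.
    """
    entries = [
        {"id": node_id, "address": address, "non_voter": non_voter}
        for node_id, address, non_voter in servers
    ]
    return json.dumps(entries, indent=2)


if __name__ == "__main__":
    # Hypothetical addresses and node IDs, for illustration only.
    print(peers_v2(["10.0.1.8:8300", "10.0.1.6:8300", "10.0.1.7:8300"]))
    print(peers_v3([
        ("node-a", "10.0.1.8:8300", False),
        ("node-b", "10.0.1.6:8300", False),
        ("node-c", "10.0.1.7:8300", False),
    ]))
```

Setting `non_voter` to `False` marks a server as a full voting member, which is the "suffrage information" the quoted text refers to.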
You need to clear out the old entries in the raft storage (/vault/vault_3) after vault_3 has been removed from the cluster; only then can it successfully rejoin the cluster as a new member.
When vault_3 was removed from the cluster, it was disconnected from the leader, so it no longer holds up-to-date data. To join the cluster successfully, Vault expects vault_3’s raft storage to be empty, so that the leader can properly replicate the current data to it.
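A minimal sketch of that clean-up step, assuming the Vault process on the node is already stopped. Rather than deleting the old raft data, it moves it into a backup directory so nothing is lost if the wrong path is given. The helper names, the backup location, and the leader address are hypothetical; `vault operator raft join` is the CLI counterpart of the `/v1/sys/storage/raft/join` call shown in the error above.

```python
import shutil
import time
from pathlib import Path


def move_raft_data_aside(storage_dir, backup_root):
    """Empty storage_dir by moving its contents into a timestamped backup directory.

    backup_root must be outside storage_dir. Run this only while Vault is stopped.
    """
    storage = Path(storage_dir)
    backup = Path(backup_root) / f"{storage.name}-backup-{int(time.time())}"
    backup.mkdir(parents=True)
    # Snapshot the listing first so we do not iterate a directory we are modifying.
    for entry in list(storage.iterdir()):
        shutil.move(str(entry), str(backup / entry.name))
    return backup


def join_command(leader_api_addr):
    """Build (but do not run) the CLI command that joins this node to the leader."""
    return ["vault", "operator", "raft", "join", leader_api_addr]


if __name__ == "__main__":
    # Hypothetical backup location and leader address; adjust to your deployment.
    saved = move_raft_data_aside("/vault/vault_3", "/vault/backups")
    print("old raft data moved to", saved)
    print("after restarting Vault, run:", " ".join(join_command("http://vault-leader.example:8200")))
```

Once the storage directory is empty and Vault has been restarted, the join should no longer be rejected with "raft storage is already initialized".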