Hi,
I’m trying to use an ASG to maintain a 3- or 5-node Vault cluster using raft as the storage backend. The ASG is useful in the sense that it will automatically spin up a new instance if one of them dies for some reason. I’m also fronting the cluster with an ELB that directs traffic to the leader using the /sys/health check.
I’ve set up my nodes to auto-rejoin using the retry_join option:
storage "raft" {
  path    = "${vault_storage_path}"
  node_id = "$${HOSTNAME}"

  retry_join {
    leader_api_addr = "https://${vault_elb_addr}:8200"
  }
}
Things work really well when the nodes are rebooted: one of the standby nodes is elected leader and the rebooted node starts up again as a standby. The ELB is also able to redirect traffic to the new leader. All good.
However, things don’t behave as expected if one of the nodes is terminated and a brand new node tries to join the cluster. This is the sequence of events I saw:
- 3 nodes A, B, C, A is the leader
- A got terminated, B is now the leader, C remains as standby
- ELB now points to the leader node B
- Node D is spun up and tries to join the cluster through Node B via the ELB
- Node B decides to step down (don’t know why), so Node B and C both become followers
- ELB has no healthy node to point to and loses redirection to Node B
- Node D loses its connection and can’t join
However, if I am able to remove Node A from the peer list before Node D attempts to join the cluster (step 3a), everything works like a charm.
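For anyone who wants to reproduce that manual removal, here is a minimal sketch of how it could be scripted. It assumes the raft configuration and remove-peer endpoints of the Vault HTTP API (/v1/sys/storage/raft/configuration and /v1/sys/storage/raft/remove-peer) and the response shape shown in the comments. The function names, node IDs and address are hypothetical, and this is not a tested production script.

```python
import json
import urllib.request


def stale_peers(raft_config, live_node_ids):
    """Return node IDs that are in the raft peer list but no longer alive.

    raft_config is assumed to be the parsed JSON from
    GET /v1/sys/storage/raft/configuration, i.e. shaped like
    {"data": {"config": {"servers": [{"node_id": "..."}, ...]}}}.
    live_node_ids would come from the ASG (hypothetical input here).
    """
    servers = raft_config["data"]["config"]["servers"]
    return [s["node_id"] for s in servers if s["node_id"] not in live_node_ids]


def remove_peer_request(vault_addr, token, server_id):
    """Build (but do not send) the request that removes one peer."""
    body = json.dumps({"server_id": server_id}).encode()
    return urllib.request.Request(
        f"{vault_addr}/v1/sys/storage/raft/remove-peer",
        data=body,
        method="POST",
        headers={"X-Vault-Token": token, "Content-Type": "application/json"},
    )
```

Each request returned by remove_peer_request would then be sent to the active node with urllib.request.urlopen(req), before the replacement node attempts its join.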
Does anyone know why Node B would step down? It looks like raft tracks nodes by node ID or IP. There isn’t a straightforward way to give the new Node D the same IP as the node that died. Is there a way to make a brand new node join successfully, and, say, a timeout after which bad nodes are removed from the cluster automatically?
Thanks!