Vault with Raft storage using AWS Auto Scaling Group and auto-rejoin

Hi,

I’m trying to use an ASG to maintain a 3- or 5-node Vault cluster with Raft as the storage backend. The ASG is useful in the sense that it will automatically spin up a new instance if one of them dies for some reason. I’m also fronting the cluster with an ELB that directs traffic to the leader using the /sys/health check.
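
For context on how the ELB picks the leader: Vault’s /v1/sys/health endpoint returns a different status code per node state, so the health check only marks the active node healthy. A rough sketch of that check in Python (the node address here is just a placeholder):

    # /v1/sys/health default status codes (per the Vault API docs):
    #   200 - initialized, unsealed, active (leader)
    #   429 - unsealed standby
    #   503 - sealed
    #   501 - not initialized
    import requests

    NODE_ADDR = "https://10.0.0.11:8200"  # placeholder backend instance

    resp = requests.get(f"{NODE_ADDR}/v1/sys/health", timeout=2)
    if resp.status_code == 200:
        print("active (leader) node - the ELB marks it healthy")
    elif resp.status_code == 429:
        print("unsealed standby - unhealthy as far as this health check goes")
    else:
        print(f"sealed/uninitialized/unreachable: {resp.status_code}")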

I’ve set up my nodes to auto-rejoin using the retry_join option:

storage "raft" {
      path    = "${vault_storage_path}"
      node_id = "$${HOSTNAME}"
      retry_join {
        leader_api_addr = "https://${vault_elb_addr}:8200"
      }
    }

Things work really well when a node is rebooted: one of the standby nodes gets elected leader and the rebooted node comes back up as a standby. The ELB is also able to redirect traffic to the new leader, so all good.

However, things didn’t behave as expected when one of the nodes was terminated and a brand new node tried to join the cluster. This is the sequence of events I saw:

  1. 3 nodes A, B, C; A is the leader
  2. A gets terminated; B is now the leader, C remains a standby
  3. The ELB now points to the leader, Node B
  4. Node D is spun up and tries to join the cluster at Node B via the ELB
  5. Node B decides to step down (don’t know why), and Nodes B and C become followers
  6. The ELB has no healthy node to point to and loses the redirection to Node B
  7. Node D loses its connection and can’t join

However, if I am able to remove Node A from the peer list before Node D attempts to join the cluster (step 3a), everything works like a charm.
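
For reference, “remove Node A from the peer list” here is just the raft remove-peer call against the current leader (the CLI equivalent is: vault operator raft remove-peer A). A rough Python equivalent against the HTTP API, with placeholder address and token:

    import requests

    LEADER_ADDR = "https://node-b.internal:8200"  # placeholder: current leader
    VAULT_TOKEN = "s.xxxxxxxx"                    # placeholder: token allowed to manage sys/storage/raft

    resp = requests.post(
        f"{LEADER_ADDR}/v1/sys/storage/raft/remove-peer",
        headers={"X-Vault-Token": VAULT_TOKEN},
        json={"server_id": "A"},  # the node_id of the dead peer
        timeout=5,
    )
    resp.raise_for_status()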

Does anyone know why Node B would step down? It looks like Raft tracks nodes by node ID or IP, and there isn’t a straightforward way to give the brand new Node D the same IP as the node that died. Is there a way to make a brand new node join successfully, and perhaps a timeout after which bad nodes are removed from the cluster automatically?

Thanks!

Hi Boon,

Thank you for posting this question.

As Vault 1.4 is the first GA release with Raft Integrated Storage, the functionality necessary to natively support running in an ASG has not yet been implemented. We are aware of it but I don’t yet have any details on when it would be implemented.

In the meantime, if you require an ASG there are a couple of avenues you could consider. One is using Consul as a storage backend; see the Consul Cloud Auto-join documentation.

Another possibility would be using a script to populate the Vault config file at instance launch with discovered node addresses.
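
For illustration, a rough sketch of that launch-time script in Python with boto3. The region, the tag name/value, and the way the output gets spliced into the config are all assumptions, not an official pattern:

    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")  # assumed region

    # Find the other running Vault instances by an assumed "role=vault-server" tag.
    resp = ec2.describe_instances(
        Filters=[
            {"Name": "tag:role", "Values": ["vault-server"]},
            {"Name": "instance-state-name", "Values": ["running"]},
        ]
    )
    peer_ips = [
        inst["PrivateIpAddress"]
        for reservation in resp["Reservations"]
        for inst in reservation["Instances"]
    ]

    # Render one retry_join block per discovered peer; the user_data script
    # would splice this into the storage "raft" stanza before starting Vault.
    stanzas = "\n".join(
        f'  retry_join {{\n    leader_api_addr = "https://{ip}:8200"\n  }}'
        for ip in peer_ips
    )
    print(stanzas)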

Sincerely,
Andy

Hi @assareh, @Boon, the following is just a thought (I use this trick elsewhere), so I’m not sure if it would work for Vault …

How about if there are N (3 or 5) separate ASGs, each ASG with a count of 1 (one) for min, max, and desired?

The user_data can have an awscli-based script that runs “describe-instances” against the “vault server” tag and creates the config file before the systemd-based Vault service is started.

Thoughts?

EDIT: The same idea above could be used with a single ASG as well, right? Instead of joining via the ELB, a small “discover by tags” script could generate the config for each node, right?

Hi shantanugadgil, thanks for posting! Yes, this could be doable. However, the challenge you may have is cleaning up dead nodes from the Raft cluster. One approach I’ve seen is using a Lambda function that automatically cleans up dead nodes before new nodes come up and join the cluster.
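
Purely as an illustration, a minimal sketch of such a cleanup function, assuming the node_id of each Vault node is its EC2 private hostname, an assumed role=vault-server tag, and a Vault token with access to the sys/storage/raft endpoints:

    import os
    import boto3
    import requests

    VAULT_ADDR = os.environ["VAULT_ADDR"]    # e.g. the ELB address
    VAULT_TOKEN = os.environ["VAULT_TOKEN"]
    HEADERS = {"X-Vault-Token": VAULT_TOKEN}

    def handler(event, context):
        # Short hostnames of the instances that are actually alive.
        ec2 = boto3.client("ec2")
        resp = ec2.describe_instances(
            Filters=[
                {"Name": "tag:role", "Values": ["vault-server"]},
                {"Name": "instance-state-name", "Values": ["running"]},
            ]
        )
        alive = {
            inst["PrivateDnsName"].split(".")[0]  # e.g. "ip-10-0-0-11"
            for r in resp["Reservations"]
            for inst in r["Instances"]
        }

        # The Raft peer set as Vault currently sees it.
        cfg = requests.get(
            f"{VAULT_ADDR}/v1/sys/storage/raft/configuration",
            headers=HEADERS, timeout=5,
        ).json()

        # Remove any peer whose node_id no longer maps to a running instance.
        for peer in cfg["data"]["config"]["servers"]:
            if peer["node_id"] not in alive:
                requests.post(
                    f"{VAULT_ADDR}/v1/sys/storage/raft/remove-peer",
                    headers=HEADERS,
                    json={"server_id": peer["node_id"]},
                    timeout=5,
                ).raise_for_status()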

Sincerely,
Andy