VAULT_ADDR failover in HA

micheelengronne · September 9, 2020, 12:47pm

I set up an HA Vault cluster with the Raft backend, and there is something I don’t quite understand.

For a client to connect to the cluster, it must use VAULT_ADDR.

According to the docs, it is better to avoid an LB in front of Vault, and each Vault node redirects to the active node. So if VAULT_ADDR does not point to the active node, that node will redirect the connection to the active node. OK, that I understand.

My question is:

What happens when the node that VAULT_ADDR references is down, I mean completely down (the server is broken and Vault no longer runs on it)? How is the client redirected to the rest of the cluster?

Is it possible to put multiple addresses in VAULT_ADDR so that the client switches automatically if the first address is unreachable?

mikegreen · September 9, 2020, 1:19pm

I’m not sure this is recommended. Where did you read that an LB is to be avoided?

jeroenjacobs79 · September 10, 2020, 7:52am

Hi,

Only one member of a Vault cluster is “active”. If the active one goes down, another one will be promoted to active. This is an important concept.

When you put an LB in front of it (like nginx), you should put all cluster members in a pool, define health checks, and only forward requests to the active member. You can use the /sys/health endpoint for this: https://www.vaultproject.io/api-docs/system/health

So, when the active node goes down, a new one will be elected, and the health checks will change accordingly. At that point, the LB will send requests to the new active node.
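To make the health-check idea concrete, here is a minimal Python sketch of how a probe could tell the active node apart from the others. It assumes the default status codes documented for the health endpoint (200 active, 429 unsealed standby, 472 DR secondary, 473 performance standby, 501 not initialized, 503 sealed) and the full API path /v1/sys/health. The helper names (classify, probe, find_active) and the node addresses in the usage note are hypothetical, not something from this thread.

```python
import urllib.error
import urllib.request

# Default HTTP status codes documented for GET /v1/sys/health.
HEALTH_STATES = {
    200: "active",               # initialized, unsealed, active
    429: "standby",              # unsealed, standby
    472: "dr-secondary",         # DR replication secondary, active
    473: "performance-standby",  # performance standby
    501: "uninitialized",
    503: "sealed",
}


def classify(status_code):
    """Map a health status code to a coarse node state."""
    return HEALTH_STATES.get(status_code, "unknown")


def probe(addr, timeout=2.0):
    """Return the HTTP status of the health endpoint, or None if unreachable."""
    url = addr.rstrip("/") + "/v1/sys/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # Non-2xx answers (429, 503, ...) still carry the node state.
        return err.code
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout: the node is down.
        return None


def find_active(addrs, probe_fn=probe):
    """Return the first address whose health check reports 'active'."""
    for addr in addrs:
        if classify(probe_fn(addr)) == "active":
            return addr
    return None
```

A load balancer does the equivalent of find_active on every health-check interval. A client without an LB could call something like find_active(["https://vault-1:8200", "https://vault-2:8200"]) and export the result as VAULT_ADDR, although that only moves the failover logic into the client.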

If you are not using an LB, you should have some DNS-based service discovery tool in place (like Consul). This will point a hostname like vault.service.consul at the active node.

DevOpsRob · September 10, 2020, 8:44am

Spot on, @jeroenjacobs79, excellent explanation.

micheelengronne · September 10, 2020, 9:03am

Thank you. So an LB is needed, then.

In the doc https://www.vaultproject.io/docs/concepts/ha#behind-load-balancers the last line, “This can cause a redirect loop and as such is not a recommended setup when it can be avoided.”, can be misleading.

mikegreen · September 10, 2020, 4:50pm

@micheelengronne I think it can be, if read out of context, maybe?
That paragraph starts with “if the only access to the Vault servers is via the load balancer”: in that case you need to set the api_addr to the LB, and overall this isn’t recommended.

I don’t think you want to isolate your Vault nodes from each other, though some have requirements for that. If you end up in that case (nodes only accessible through the LB’s URL), then you’re stuck with this possible loop (temporarily, while the LB updates its health checks).
