How to change HA backend with 0 downtime?

spuder · February 26, 2020, 6:27pm

We are currently using ‘zookeeper’ for HA on our Vault clusters. I want to change the HA backend to ‘consul’.

I can't find any documentation on how to handle changing an HA backend. Has anyone done it successfully?

I have a 3-node test cluster. My plan is to do the following:

1. Stop vault on nodes 2 & 3
2. Stop zookeeper on nodes 2 & 3
3. Change configs to consul on nodes 2 & 3
4. Change configs to consul on node 1
5. Restart node 1 (this is the step I’m most worried about)
6. Vault unseal
7. Start vault on nodes 2 & 3
8. Vault unseal
9. Vault operator step down (to ensure HA works)

webmutation · February 27, 2020, 1:42pm

Hi, there are some guidelines on how to do upgrades; they should be applicable.

also

The most important thing to note is to stop the standby servers first; otherwise one of them will be elected and become the active server.
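
As a rough illustration of that ordering, here is a minimal Python sketch. It assumes you have already fetched each node's `GET /v1/sys/leader` response and parsed the JSON; the `is_self` field reports whether the responding node is the active one. The node names and the `shutdown_order` helper are hypothetical, made up for this example.

```python
def shutdown_order(leader_status):
    """Return node names with standbys first and the active node last.

    leader_status maps a node name to the parsed JSON body of that node's
    GET /v1/sys/leader response; only the "is_self" field is read.
    """
    # sorted() is stable and False sorts before True, so standbys keep
    # their input order and any active node moves to the end.
    return sorted(
        leader_status,
        key=lambda name: bool(leader_status[name].get("is_self")),
    )


# Hypothetical responses for a three-node cluster where node1 is active.
status = {
    "node1": {"ha_enabled": True, "is_self": True},
    "node2": {"ha_enabled": True, "is_self": False},
    "node3": {"ha_enabled": True, "is_self": False},
}
print(shutdown_order(status))  # ['node2', 'node3', 'node1']
```

The sketch only decides the order; actually stopping the services is left to whatever tooling you already use.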

I am not sure you will be able to have zero downtime. I would schedule a maintenance window, or create a staging environment and test the procedure there if it's truly mission critical.

spuder · February 27, 2020, 5:40pm

Thanks for the links.

I do have 4 clusters, so I’ll be testing on the least important one first. What I’m not sure about is node 1. The instructions say:

Properly shut down the remaining (active) node. Note: it is very important that you shut the node down properly. This causes the HA lock to be released, allowing a standby node to take over with a very short delay.

If some of the nodes have the lock in zookeeper, but others have the lock in consul, won’t that cause a split brain?

The only way I see around this is to fully shut down all 3 nodes, then swap the HA config, then start the cluster back up. I'm looking for a more elegant solution.
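
If a mixed state like that did occur, one symptom would presumably be more than one node reporting itself as active. A minimal Python sketch of that check follows, again assuming parsed `GET /v1/sys/leader` responses per node; the `find_split_brain` helper and the sample data are hypothetical, and this only detects the condition, it does not prevent it.

```python
def find_split_brain(leader_status):
    """Return the nodes claiming to be active if more than one does, else []."""
    # A node reports "is_self": true when it believes it holds the HA lock.
    active = [
        name for name, body in leader_status.items() if body.get("is_self")
    ]
    return active if len(active) > 1 else []


# Hypothetical mixed state: node1 still locks via zookeeper while node2
# has already been restarted against consul and took a lock there.
mixed = {
    "node1": {"ha_enabled": True, "is_self": True},
    "node2": {"ha_enabled": True, "is_self": True},
    "node3": {"ha_enabled": True, "is_self": False},
}
print(find_split_brain(mixed))  # ['node1', 'node2']
```

An empty list means at most one node claims to be active, which is what you would want to see after each restart in a test cluster.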
