Subject: Extending HashiCorp Vault Cluster to Multiple Regions - Questions Regarding Configuration and Disaster Recovery
We are currently running HashiCorp Vault in a single region with 3 nodes (1 leader and 2 followers), configured with the open-source version. We are planning to extend the Vault cluster to another region and have a few questions before proceeding with this activity:
Can Vault be configured to have one node in one region and the other nodes in a different region (multi-region setup)?
If yes, are there any specific configurations or prerequisites we need to consider from the Vault side to support a multi-region setup?
If a disaster occurs in the region hosting the leader node, will this affect Vault’s performance (e.g., delays or service interruptions)?
In the event of a leader failure, will Vault automatically elect a new leader, or will manual intervention be required to reassign leadership?
We’d appreciate any advice or best practices for setting up Vault in multiple regions and handling potential failures.
All of what you describe is certainly possible, but I would suggest it is typically managed as a Vault Enterprise deployment. I think you’ll find HashiCorp’s multi-cluster architecture guide covers all the answers you’re looking for, but please post again if that isn’t the case!
As jlj7 mentioned, this can be done if you are running Vault Enterprise, using the Disaster Recovery (DR) replication feature.
Yes, you will need a Vault Enterprise license to enable DR replication.
It really depends on the location of your secondary cluster and the latency between regions. But if your primary cluster goes down, Vault cannot promote the secondary cluster automatically, so a Vault administrator needs to do it manually.
If you are talking only about your primary cluster, then if one node (the leader) goes down, the cluster will elect a new leader automatically. But if you are referring to a secondary cluster using Disaster Recovery, then, as above, it will require manual intervention to promote that cluster to primary.
Something else that I think is worth pointing out, momina, is the distinction between a primary cluster and the leader of an individual cluster: if the active node or leader of a cluster goes down, and the cluster is configured for HA (high availability), then the election of a new leader will happen automatically, as part of the Raft protocol that governs the cluster. (There is some nuance here, around autopilot and quorum for starters, but I’m glossing over all that for our purposes.)
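To make the quorum point a little more concrete for the "one node in one region, the rest in another" idea, here is a rough sketch of the standard Raft majority arithmetic. It is not Vault code: the function names and the region labels are made up for illustration, it assumes every node is a voter, and it ignores autopilot features such as dead-server cleanup.

```python
# Rough sketch of Raft majority arithmetic; not Vault code.
# Assumption: every node is a voting member, and autopilot is ignored.

def quorum(voters: int) -> int:
    """Smallest number of voters that forms a majority."""
    return voters // 2 + 1


def failure_tolerance(voters: int) -> int:
    """How many voters can be lost while a majority remains."""
    return voters - quorum(voters)


def survives_region_loss(nodes_per_region: dict[str, int], lost_region: str) -> bool:
    """True if the remaining voters can still form a majority."""
    total = sum(nodes_per_region.values())
    remaining = total - nodes_per_region[lost_region]
    return remaining >= quorum(total)


# Hypothetical layout: a single 3-node cluster stretched across two regions.
layout = {"region-a": 1, "region-b": 2}

for region in layout:
    ok = survives_region_loss(layout, region)
    print(f"lose {region}: quorum kept = {ok}")
```

With that layout, losing the single-node region leaves 2 of 3 voters, so a leader can still be elected, but losing the two-node region leaves 1 of 3 and the cluster cannot elect a leader until quorum is restored. In other words, stretching one 3-node cluster over two regions does not by itself protect you from losing the region that holds the majority.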
Secondly, depending on your reasons for extending Vault to another region, what you may actually want is Performance Replication (as well as DR on both clusters, in all likelihood): a DR secondary cluster will be ‘warm’, and won’t be able to serve clients until it’s promoted (in the event of a disaster, usually). However, Performance Replication can help ensure that clients in geographically disparate regions are served in a timely fashion by your Vault deployment. (There are other uses as well; again, I’m glossing over quite a bit here.)