Any configuration example on DR?

Is there a configuration example for DR on Vault? Actual code would be much appreciated.

As an update: I have already read the HashiCorp docs, but I want to know the best practice for automating this, for example with a script or Terraform provider code.

// , I’m sure many people would disagree with me, but I think HashiCorp Vault DR is ripe for automation.

What’s your estimated SLO? And does it allow for 15-60 minutes of downtime in the event of an outage?

If so, you can probably just have manual failover.

But if not, I have some suggestions and code.

First, don’t automate failover triggering.

Vault does not support an automatic failover/promotion of a DR secondary cluster, and this is a deliberate choice due to the difficulty in accurately evaluating why a failover should or shouldn’t happen. For example, imagine a DR secondary loses its connection to the primary. Is it because the primary is down, or is it because networking between the two has failed?

If the DR secondary promotes itself and clients start connecting to it, you now have two active clusters whose data sets will immediately start diverging. There’s no way to understand simply from one perspective or the other which one of them is right.

It’s bad joss.
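
To make the "one perspective is not enough" point concrete, here is a minimal Python sketch. It assumes you already probe the primary from several vantage points; the function name, the vantage point names, and the report wording are all hypothetical, not anything Vault provides. It only produces a report for a human and never promotes anything.

```python
def failover_report(probes):
    """Summarize primary reachability for a human operator.

    `probes` maps a vantage point name (hypothetical, e.g. "dr-site") to
    True if the primary answered from there. This only reports; the
    promotion decision stays with a person.
    """
    if not probes:
        raise ValueError("need at least one vantage point")
    unreachable = sorted(name for name, ok in probes.items() if not ok)
    if not unreachable:
        return "primary reachable everywhere: do not promote"
    if len(unreachable) < len(probes):
        # Some sites still see the primary, so the primary is probably alive.
        return ("primary unreachable only from " + ", ".join(unreachable)
                + ": likely a network partition, do not promote")
    return "primary unreachable from every vantage point: page an operator"
```

A DR secondary that only sees its own failed probe cannot tell the last two cases apart, which is why the trigger stays manual.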

Here’s some example code straight from the HashiHorse’s mouth:

For partially automating this, I think it’s better to get the DR Operation token as part of the setup, and stash it somewhere reeeeeeeeeal safe for later use in a break-glass situation.
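
As a rough illustration of getting that token during setup, here is a Python sketch against the DR secondary's `generate-operation-token` HTTP endpoints. It is not the linked scripts. The helper names and the example address are my own, and the response field names (`nonce`, `otp`, `complete`, `encoded_token`) are what I expect from the API docs, so check them against your Vault version. The key shares are the primary cluster's unseal or recovery keys.

```python
import json
import urllib.request

# API path on the DR secondary (no Vault token is sent; key shares authorize it).
BASE = "/v1/sys/replication/dr/secondary/generate-operation-token"


def build_request(addr, path, body):
    """Build a JSON POST request for the Vault API without sending it."""
    return urllib.request.Request(
        addr.rstrip("/") + path,
        data=json.dumps(body).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def start_attempt(addr):
    """Start a generation attempt; the reply should carry a nonce and an OTP."""
    return build_request(addr, BASE + "/attempt", {})


def submit_key_share(addr, key, nonce):
    """Submit one key share for the attempt identified by `nonce`."""
    return build_request(addr, BASE + "/update", {"key": key, "nonce": nonce})


def send(req):
    """Send a request and parse the JSON reply."""
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.loads(resp.read())


def generate_encoded_token(addr, key_shares, send=send):
    """Run the whole flow; returns (encoded_token, otp) for later decoding."""
    attempt = send(start_attempt(addr))
    status = attempt
    for share in key_shares:
        status = send(submit_key_share(addr, share, attempt["nonce"]))
        if status.get("complete"):
            break
    return status.get("encoded_token", ""), attempt.get("otp", "")
```

The encoded token still has to be decoded with the OTP, for example with `vault operator generate-root -dr-token -decode=<encoded_token> -otp=<otp>`. Store the decoded token, not the key shares.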

I also wrote a few crappy bash scripts for getting the DR Operation Token, and for using it later:

Bug reports and PRs welcome!

I’ll take a look at the code. I don’t want to replace the manual failover, since, as you said, HashiCorp recommends keeping it that way. What I would like to automate is the configuration of DR itself, with a script or Terraform provider code. I would like to know if such an example exists.

// , I think if you wanted to do this in Terraform, you’d need to use a Null Resource.

IDK if DR config’s a good fit for Terraform, honestly.
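
If a plain script is acceptable instead of Terraform, the setup side boils down to three calls against the replication API: enable DR on the primary, generate a secondary activation token, and enable DR on the secondary with that token. Below is a minimal Python sketch of that sequence, not an official example. The function names, the `30m` TTL, and the way tokens are passed in are my assumptions. Be careful: enabling a cluster as a DR secondary wipes its existing data, and the script is not idempotent (re-running it against an already enabled primary should be expected to fail).

```python
import json
import urllib.request


def vault_post(addr, token, path, body):
    """POST a JSON body to a Vault API path and return the parsed reply."""
    req = urllib.request.Request(
        addr.rstrip("/") + "/v1/" + path,
        data=json.dumps(body).encode(),
        headers={"X-Vault-Token": token, "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        raw = resp.read()
    return json.loads(raw) if raw else {}  # some calls reply with no body


def configure_dr(primary_addr, primary_token, secondary_addr,
                 secondary_token, secondary_id, post=vault_post):
    """Wire a DR secondary to a primary; returns the secondary's id."""
    # 1. Turn the primary into a DR primary.
    post(primary_addr, primary_token, "sys/replication/dr/primary/enable", {})
    # 2. Ask the primary for a wrapped activation token for this secondary.
    reply = post(primary_addr, primary_token,
                 "sys/replication/dr/primary/secondary-token",
                 {"id": secondary_id, "ttl": "30m"})
    activation = reply["wrap_info"]["token"]
    # 3. Activate the secondary. This clears the secondary's current data.
    post(secondary_addr, secondary_token,
         "sys/replication/dr/secondary/enable",
         {"token": activation, "primary_api_addr": primary_addr})
    return secondary_id
```

The `post` parameter is only there so the sequence can be dry-run or tested with a fake sender before pointing it at real clusters.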

How’d it go?

Did you get any aspects of this automated or make any runbooks?

This went well, but now I’m having a wrapping issue, even when using the same certificates across clusters in different regions.