Mutual auto-unseal - transit to shamir recovery

Hello

We have clusters on premises, so no HSM/KMS is available. We’d like to set up mutual auto-unseal between 2 clusters.

The problem:
How does one migrate back from transit to Shamir when both clusters are down?

Documentation is a bit thin when it comes to transit recovery in general.

Maybe simpler case:

Cluster A - Shamir
Cluster B - transit

Cluster A (Shamir) --> Cluster B (transit from A)

How to unseal cluster B (how to perform the recovery), when cluster A is down and cannot be started?

To be honest, I’m a bit disappointed with HashiCorp.

The product seems to lack proper documentation. There are some examples, but in many cases you have to “reinvent the wheel”.
Maybe things are obvious once you know the product, but it is definitely not intuitive.

There are no ‘best practices’, just 3-4 random articles that you have to glue together if you’d like to run a sane environment.

E.g., I am looking for documentation of:

  • 2 HA clusters (3-5 nodes each)
  • where at least one has auto-unseal (preferably mutual auto-unseal, with Shamir recovery as a manual “plan B”)
  • a scenario for transit recovery of a cluster with auto-unseal (the topic of this post)
  • a TLS example (and guidance, with an example, for replacing certificates/credentials: rolling update, node replacement or full cluster stop?)
  • Kubernetes integration with Vault: best practices on certificate rotation
  • best practices on Vault ACL models
  • best practices on backup/restore of your transit key
  • any description of 2 mutual auto-unseal clusters (best practices, deadlock recovery, any option for recovery with Shamir? an external Shamir cluster?)

How do you run your clusters if you cannot answer that…

I feel you might be setting yourself up for failure or frustration with your intention to use 2 clusters to mutually unseal each other. This isn’t an intended use case…

Using transit to unseal another cluster expects the transit cluster to be highly available. If you lose that cluster, you won’t be able to unseal. If you are only using it for unsealing, it can go down without sealing the other cluster or causing any immediate harm. But when the cluster that relies on its transit endpoint next needs to be unsealed, the unseal will not work. If you lose the unsealing transit cluster and can’t get it back online, you’ll have to recover the cluster that depends on it.
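
The dependency described above can be modelled in a few lines. This is only a sketch, not Vault code: the `Cluster` class, the `seal` labels and the `recoverable` function are hypothetical names made up for illustration. It assumes a full outage (every cluster is sealed), that a Shamir-sealed cluster can always be unsealed by operators holding the key shares, and that a transit-sealed cluster can be unsealed only after its provider is unsealed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Cluster:
    name: str
    seal: str                       # "shamir" or "transit" (illustrative labels)
    provider: Optional[str] = None  # cluster that holds the transit key, if any
    lost: bool = False              # True if the cluster cannot be started at all


def recoverable(clusters):
    """Return the names of clusters that can be unsealed after a full outage."""
    unsealed = set()
    changed = True
    # Repeat until no further cluster can be unsealed (a simple fixed point).
    while changed:
        changed = False
        for c in clusters:
            if c.name in unsealed or c.lost:
                continue
            manual = c.seal == "shamir"
            via_transit = c.seal == "transit" and c.provider in unsealed
            if manual or via_transit:
                unsealed.add(c.name)
                changed = True
    return unsealed


# Cluster A (Shamir) --> Cluster B (transit from A), A can be started
one_way = [Cluster("A", "shamir"), Cluster("B", "transit", "A")]
print(sorted(recoverable(one_way)))   # ['A', 'B']

# Same layout, but A is down and cannot be started
a_lost = [Cluster("A", "shamir", lost=True), Cluster("B", "transit", "A")]
print(recoverable(a_lost))            # set()

# Mutual auto-unseal: each cluster waits for the other
mutual = [Cluster("A", "transit", "B"), Cluster("B", "transit", "A")]
print(recoverable(mutual))            # set()
```

Under these assumptions the mutual layout never recovers from a simultaneous outage, because neither cluster has a starting point that does not depend on the other.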

What are you looking for that isn’t documented in “TCP - Listeners - Configuration | Vault | HashiCorp Developer”?
That page covers the configuration as well as reloading.

What is the use case? Which auth and secret engines are used?
Best practice is to template as much as you can and to grant least-privilege access.

Do you mean the recovery key? If you’re using transit, Vault manages the key for you, and you can simply rotate it when you want.

I don’t believe you’ll find this, as having Vault clusters mutually unseal each other isn’t a realistic use of Vault.

Maybe backing up the transit seal key regularly is a workaround?
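
For reference, the transit secrets engine does expose backup and restore endpoints for named keys. Below is a rough sketch of the HTTP calls involved, assuming a transit mount at `transit/` and a key named `autounseal` (both placeholders, not taken from this thread). The helper names are hypothetical. Note that `exportable` and `allow_plaintext_backup` cannot be disabled again once set, and that the backup blob contains the key material, so it has to be stored as carefully as Shamir shares. Whether a restored key is enough to unseal a dependent cluster in your setup is something to verify in a test environment first.

```python
import json
import urllib.request


def backup_plan(key_name, mount="transit"):
    """Build the (method, path, payload) steps for backing up a transit key."""
    return [
        # Both flags must be enabled before the key can be backed up;
        # Vault does not allow turning them off again afterwards.
        ("POST", f"/v1/{mount}/keys/{key_name}/config",
         {"exportable": True, "allow_plaintext_backup": True}),
        # The response carries the backup blob in data.backup.
        ("GET", f"/v1/{mount}/backup/{key_name}", None),
    ]


def restore_step(key_name, backup_blob, mount="transit"):
    """Build the request that restores a backup blob on another Vault."""
    return ("POST", f"/v1/{mount}/restore/{key_name}", {"backup": backup_blob})


def send(vault_addr, token, step):
    """Send one step to Vault; vault_addr and token are supplied by the caller."""
    method, path, payload = step
    body = json.dumps(payload).encode() if payload is not None else None
    req = urllib.request.Request(vault_addr + path, data=body, method=method)
    req.add_header("X-Vault-Token", token)
    with urllib.request.urlopen(req) as resp:
        raw = resp.read()
        # Some endpoints answer with an empty body (204 No Content).
        return json.loads(raw) if raw else None
```

Nothing is sent until `send` is called; `backup_plan("autounseal")` only returns the list of steps, so it can be reviewed before running against a real cluster.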

Rather than looking for a workaround, you should just not attempt to have clusters mutually unseal each other. That architecture is flawed.

I don’t agree with you.
This design is unconventional, but it isn’t flawed.
The logical pattern (entangled design) is used in many fields of engineering, from avionic engines to nuclear cores, so I think this unusual and complex design should be explored; it is already deployed in some scenarios.
IMHO.

Regards,
Albe