I’m thinking about using a Vault transit cluster to unseal other Vault clusters (all of them running on Kubernetes). At some point you can’t fully automate the life cycle, because the vault-transit cluster itself will have to use Shamir.
Because of this open issue, auto-unseal using a cloud KMS isn’t an option. The issue means that if you use GCPKMS and the key is accidentally deleted or becomes unavailable, there is no way to recover Vault, and you can’t restore from a backup. This is not a risk you want to take.
- Let’s assume someone deleted the GCPKMS key. If I also have the same key stored (via some custom code) in AWSKMS, how should I recover Vault: should I adjust the main configuration to point to that key, or should I create a new GCPKMS key? Is that enough to fully recover Vault when the original GCPKMS key has been deleted?
The vault-transit cluster is used to unseal the other clusters. However, because of the issue mentioned above, we can’t use cloud auto-unseal for the vault-transit cluster itself, meaning we will have to use Shamir to unseal it.
If you want to fully automate the Vault clusters’ life cycle on top of Kubernetes and reduce toil (without relying on a cloud provider), you will have to build a custom mechanism to perform the unseal on the vault-transit cluster.
- Let’s assume the node where your vault-transit pod is running has to be replaced for whatever reason. When the new pod is created, Vault starts sealed, so you have to gather the engineers who hold the keys and perform the unseal. Depending on how the keys are distributed, finding those engineers and unsealing Vault might take a while, which directly impacts the SLI of the service.
One alternative is to create a cronjob (or a controller) that automatically initializes and unseals the Vault pods. My question is the following: as long as the unseal keys are saved in a safe place (with access restrictions) and also duplicated in different places for DR purposes, what are the concerns and cons of this approach?
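To make the cronjob idea concrete, here is a minimal sketch of what one run of such a job could do, written in Python against Vault's HTTP API (`GET /v1/sys/seal-status` and `PUT /v1/sys/unseal`). The function names, the `vault-transit` address, and the way the key shares reach the job are my own assumptions for illustration, not an existing tool. Initialization and TLS setup are left out.

```python
import json
import urllib.request
from typing import Callable, Iterable, Optional

# (method, url, body) -> decoded JSON response; injectable so the logic can be tested
Transport = Callable[[str, str, Optional[dict]], dict]


def http_json(method: str, url: str, body: Optional[dict] = None) -> dict:
    """Send one JSON request to Vault and decode the JSON reply."""
    data = json.dumps(body).encode() if body is not None else None
    req = urllib.request.Request(url, data=data, method=method)
    if data is not None:
        req.add_header("Content-Type", "application/json")
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)


def unseal_if_sealed(addr: str, keys: Iterable[str],
                     request: Transport = http_json) -> bool:
    """Submit key shares until Vault reports unsealed.

    Returns True if Vault is unsealed when the function finishes.
    `addr` and `keys` are hypothetical inputs: where the shares come from
    (Kubernetes Secret, external store, ...) is up to the operator.
    """
    status = request("GET", f"{addr}/v1/sys/seal-status", None)
    if not status.get("initialized", False):
        return False  # initialization is out of scope for this sketch
    for key in keys:
        if not status["sealed"]:
            break  # threshold reached, remaining shares are not sent
        status = request("PUT", f"{addr}/v1/sys/unseal", {"key": key})
    return not status["sealed"]
```

A cronjob would call something like `unseal_if_sealed("http://vault-transit:8200", shares)` on every run. Note that the job needs access to at least the threshold number of shares in one place, which is the trade-off I’m asking about.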