Raft storage snapshot no longer works if transit token is changed

In my current on-premises deployment, I run a transit Vault instance and a 3-5 node Vault cluster on Raft integrated storage with transit auto-unseal, all in a Kubernetes cluster. A month ago I could take a nightly Raft snapshot, deploy an entirely new transit Vault instance with a new token and a fresh set of cluster nodes, and then restore the cluster from a snapshot taken on the previous instance, and this worked. Today it stopped working, and I get the error:

vault operator raft snapshot restore vault_snapshots/2021-12-07_19:49:59.53da54db1421d11568e2c28618f5813189ca6c6a483f025d1e2d8efbdd5679a9
Error installing the snapshot: Error making API request.

URL: POST https://oasis-vault.ocp.dhe.duke.edu/v1/sys/storage/raft/snapshot
Code: 400. Errors:

  • could not verify hash file, possibly the snapshot is using a different autoseal key; use the snapshot-force API to bypass this check

If I use -force, Vault seals and does not work at all. I am very new to managing Vault, so if there is something I need to do to restore an older Vault cluster onto a newly deployed transit instance so that older snapshots work, please let me know.
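
For reference, the save and restore cycle described above can be sketched roughly as follows. This is only an illustration: the helper function names, SNAPSHOT_DIR, and DRY_RUN are hypothetical and chosen for this example, and DRY_RUN defaults to printing the commands instead of running them. The underlying `vault operator raft snapshot save` and `restore` subcommands need VAULT_ADDR and a valid token when actually executed.

```shell
#!/bin/sh
# Sketch of the nightly snapshot save / restore cycle.
# SNAPSHOT_DIR, DRY_RUN and the function names are hypothetical, for illustration only.
SNAPSHOT_DIR="${SNAPSHOT_DIR:-vault_snapshots}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the command, 0 = actually run it

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

snapshot_save() {
    # Take a snapshot of the running Raft cluster into SNAPSHOT_DIR.
    run vault operator raft snapshot save "$SNAPSHOT_DIR/$1"
}

snapshot_restore() {
    # Plain restore; this is the call that fails with the hash error
    # when the snapshot was sealed with a different auto-unseal key.
    run vault operator raft snapshot restore "$SNAPSHOT_DIR/$1"
}

snapshot_restore_force() {
    # Bypasses the hash check, as the error message suggests. In my case
    # the cluster sealed afterwards, so this is not a fix by itself.
    run vault operator raft snapshot restore -force "$SNAPSHOT_DIR/$1"
}
```

With the defaults, calling `snapshot_save nightly.snap` only prints the command it would run; setting `DRY_RUN=0` executes it against the cluster that `VAULT_ADDR` points to.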

When you migrate to a new transit auto-unseal instance, you are rotating your keys. Any backups or snapshots taken before that point become invalid. This is especially true if you were using auto-unseal, which means you don’t have your own keys in hand.

One thing you can do is use BYOK (bring your own key), which is a bit of an anti-pattern compared to “no one has the key, only shards of the key”, but that is the solution to something like this.

Am I missing something? This does not make sense. How do you back up and restore a Vault cluster that uses transit auto-unseal? There does not appear to be a way to deploy an update to the transit Vault instance (such as a version increment) without invalidating the snapshots that were saved for the entire cluster under the previously deployed transit instance. That seems like a big regression in functionality compared to just one month ago, when I am sure this worked.

If I wasn’t clear, my answer depends on you moving your key or rekeying the instance. There is no regression; this is how it has worked since auto-unseal was introduced.

Can you please point me to the right documentation for rekeying an instance? Thanks.

google … hashicorp vault rekey