Failed to decrypt encrypted stored keys after consul snapshot restore

Hi,

I have set up an architecture with Consul and Vault in a Kubernetes cluster on AWS. Vault auto-unseals with AWS KMS when the pods start.

Recently I’ve been testing Vault backups using consul snapshot.

The scenario I tested is:

  1. Take a snapshot of Vault’s data: consul snapshot save vault.prod.snap
  2. Remove Vault’s data from Consul: consul kv delete -recurse vault/
  3. Remove the Vault StatefulSets and pods
  4. Restore the snapshot: consul snapshot restore vault.prod.snap
  5. Re-create the Vault StatefulSets
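
For anyone who wants to reproduce the scenario, here is a minimal shell sketch of the five steps. The consul commands are the ones from the list; the kubectl commands, the StatefulSet name (vault) and the manifest file (vault-statefulset.yaml) are assumptions, since the post does not give them. Because step 2 is destructive, the sketch only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch of the backup/restore test. Dry-run by default: commands are
# printed, not executed. Set DRY_RUN=0 to really run them.
DRY_RUN="${DRY_RUN:-1}"
SNAPSHOT="${SNAPSHOT:-vault.prod.snap}"
VAULT_STS="${VAULT_STS:-vault}"                             # assumed StatefulSet name
VAULT_MANIFEST="${VAULT_MANIFEST:-vault-statefulset.yaml}"  # assumed manifest file

# Print the command in dry-run mode, execute it otherwise.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

restore_test() {
  run consul snapshot save "$SNAPSHOT"         # 1. snapshot the Consul data
  run consul kv delete -recurse vault/         # 2. wipe Vault's storage prefix
  run kubectl delete statefulset "$VAULT_STS"  # 3. remove Vault (pods go with it)
  run consul snapshot restore "$SNAPSHOT"      # 4. restore the snapshot
  run kubectl apply -f "$VAULT_MANIFEST"       # 5. re-create Vault
}

restore_test
```

Running it as-is lists the five commands prefixed with "+", which makes it easy to review the order before running it for real with DRY_RUN=0.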

Result:

During the auto-unseal I got an HTTP 500 error on the third key:
body {"errors":["failed to decrypt encrypted stored keys: cipher: message authentication failed"]}

It turns out that after a consul snapshot restore my unseal keys are no longer valid.

I also ran another test where I don’t wipe Vault with consul kv delete -recurse vault/.
I just remove a couple of policies in the UI and then restore. That scenario seems to work correctly; it’s only when I restore from “scratch” that my Vault can no longer unseal.

Could somebody give me a hint, please?

I’ve finally fixed the problem.

After fixing the following two issues, my Vault was able to unseal again:

  1. Duplicate node IDs: https://github.com/hashicorp/consul/issues/4741
  2. The Consul server has to be given an ACL token with sufficient rights

For the first issue, I reused a Ruby script and added it to the Consul Docker image to generate a permanent node ID for each Consul node.
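
The Ruby script itself is not shown here, so the following is a different technique, sketched in shell, that gets a similar result: it derives a UUID-shaped node ID from the host name, so the same pod always ends up with the same ID. It assumes the host name is stable across restarts (as with StatefulSet pod names) and that md5sum is present in the image; the function name node_id_for is made up for this example. The -node-id flag is a regular consul agent option.

```shell
#!/bin/sh
# Derive a stable node ID from a name (sketch, not the original Ruby script).
# The MD5 hash of the name is formatted as 8-4-4-4-12 hex digits.
node_id_for() {
  printf '%s' "$1" \
    | md5sum \
    | cut -c1-32 \
    | sed -E 's/^(.{8})(.{4})(.{4})(.{4})(.{12})$/\1-\2-\3-\4-\5/'
}

# uname -n prints the host name (the pod name inside a Kubernetes pod).
NODE_ID="$(node_id_for "$(uname -n)")"
echo "$NODE_ID"

# The ID would then be passed to the agent alongside your usual flags:
# exec consul agent -node-id="$NODE_ID"
```

Since the ID depends only on the name, two pods collide only if they share a host name, and a pod keeps its ID after being re-created.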

Regarding the second issue, my ACL policy for the agent did not grant the permissions the agent needs to make the necessary updates.
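
The post does not say which permissions were missing, so the following is only a sketch of a commonly used baseline agent policy (node write, service read) under the Consul 1.4+ ACL system, not the exact policy from this setup. The file name agent-policy.hcl, the policy name agent-policy and the function write_agent_policy are made up for this example; adjust the rules to whatever your agents actually need.

```shell
#!/bin/sh
# Write a baseline agent ACL policy to a file (assumed rules, see above).
POLICY_FILE="${POLICY_FILE:-agent-policy.hcl}"   # hypothetical file name

write_agent_policy() {
  cat > "$1" <<'EOF'
# Let the agent register and update its own node entry.
node_prefix "" {
  policy = "write"
}
# Let the agent read services.
service_prefix "" {
  policy = "read"
}
EOF
}

write_agent_policy "$POLICY_FILE"

# Registering the policy and creating a token needs a token with ACL write
# rights, so these are left as comments to keep the sketch free of side effects:
# consul acl policy create -name "agent-policy" -rules @"$POLICY_FILE"
# consul acl token create -description "agent token" -policy-name "agent-policy"
```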