We have a 3-node cluster running Consul and Vault. I’ve got a script that runs every hour to take a snapshot of the Consul data for disaster recovery.
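For context, the script is basically a thin wrapper around the CLI; a minimal sketch of what it does is below (the backup directory and retention count are placeholders, not our exact values):

```bash
#!/usr/bin/env bash
# Hourly Consul snapshot for DR, run from cron on one of the servers.
set -euo pipefail

BACKUP_DIR=/var/backups/consul                          # placeholder path
OUT="${BACKUP_DIR}/consul-$(date +%Y%m%d-%H%M%S).snap"

# Talk to the agent on this host only.
export CONSUL_HTTP_ADDR=http://127.0.0.1:8500

# Take the snapshot (the save command verifies the file after writing it),
# then inspect it as an extra sanity check.
consul snapshot save "${OUT}"
consul snapshot inspect "${OUT}"

# Keep the 48 most recent snapshots (placeholder retention).
ls -1t "${BACKUP_DIR}"/consul-*.snap | tail -n +49 | xargs -r rm -f
```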
Recently, the “consul snapshot save” command has been failing intermittently. I upgraded Consul to version 1.8.6 at the weekend and made sure the script talks to the Consul agent running on the same host as itself, but we’re still getting errors like this:
Error verifying snapshot file: failed to read snapshot file: failed to read or write snapshot data: unexpected EOF
Does it matter that I’m trying to use “consul snapshot save” on a follower rather than the leader?
Is there anything else I should be checking to stop errors like this happening?
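Regarding the follower question: for what it’s worth, this is how I’ve been checking which server is currently the leader before deciding where to run the command (a quick sketch; the address assumes the default local agent on port 8500):

```bash
# List the raft peers; the State column shows which server is the leader.
consul operator raft list-peers

# Or ask the local agent over the HTTP API for the current leader's address.
curl -s http://127.0.0.1:8500/v1/status/leader
```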
I’ve since discovered that our use of Vault has been resulting in a lot of AWS IAM auth tokens being generated but never revoked once they were finished with. This is what had been causing the snapshots to grow so large. Cleaning up the tokens has reduced the backups down to single-digit-MB files.
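In case it helps anyone else, the cleanup was roughly a matter of walking the outstanding token accessors and revoking the stale ones. A rough sketch of the commands involved (the `<accessor>` value is a placeholder, and the logic for deciding which accessors were actually stale is omitted here):

```bash
# List every outstanding token accessor; this list was huge in our case.
vault list auth/token/accessors

# Inspect a token by accessor to see which auth method issued it and its TTL.
vault token lookup -accessor <accessor>

# Revoke a token by accessor once confirmed stale.
vault token revoke -accessor <accessor>
```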
Are you able to restore the snapshot to the newly enabled engine?
I am not able to restore the secrets from the snapshot to the newly enabled Vault kv/ secrets engine.
If I back up and restore to the same secrets engine with the same accessor ID, it works, but it doesn’t work when everything is created from scratch and the previously taken snapshot is used to restore secrets to the new Vault/Consul stack (even if the path and the secrets engine being restored to are the same).
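For reference, the restore flow I am attempting looks roughly like this (the snapshot file name and the systemd unit name are placeholders). My understanding is that, because the Consul snapshot carries Vault’s encrypted storage, the restored data has to be unsealed with the original cluster’s unseal keys rather than keys from a fresh vault operator init:

```bash
# Restore the Consul snapshot onto the new cluster.
consul snapshot restore consul-backup.snap

# Restart Vault so it reads the restored storage backend.
sudo systemctl restart vault

# Unseal with the unseal keys from the ORIGINAL cluster; repeat until the
# unseal threshold is reached, then confirm the state with vault status.
vault operator unseal
vault status
```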