Backup and restore of a Vault cluster with a Consul storage backend

I have some questions about backing up and restoring a Vault cluster whose storage backend is a Consul cluster.

My plan is to take a daily backup of the Vault data with a cron job; then, if a person or an application deletes some Vault entries, we will restore from the daily backups.

Since the storage backend is a Consul cluster, I’m using the ‘consul snapshot save <daily_snapshot_file>’ command to do the real work.
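For illustration, the daily cron job could look roughly like the sketch below. The backup directory, the file naming scheme and the script path in the crontab comment are assumptions, not something from this thread; only ‘consul snapshot save’ is the command discussed here.

```shell
#!/bin/sh
# Hypothetical backup location; override by exporting BACKUP_DIR.
BACKUP_DIR="${BACKUP_DIR:-/var/backups/consul}"

# Build a dated snapshot path, e.g. <BACKUP_DIR>/consul-2024-01-31.snap.
snapshot_path() {
    printf '%s/consul-%s.snap\n' "$BACKUP_DIR" "$(date +%Y-%m-%d)"
}

# Save a snapshot of the Consul state, which includes the Vault data.
daily_backup() {
    mkdir -p "$BACKUP_DIR" || return 1
    consul snapshot save "$(snapshot_path)"
}

# Example crontab entry (02:00 every day; the script path is hypothetical):
# 0 2 * * * /usr/local/bin/vault-consul-backup.sh
```

The functions are only defined here; a real script would call daily_backup as its last line.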

I’ve read the doc at Vault Data Backup Standard Procedure | Vault - HashiCorp Learn, but I’m confused by the preparation list, especially this sentence: ‘Bring your Vault and Consul clusters back online following the circumstances that required you to restore from backup’

I don’t have much experience with Vault/Consul, so to play it safe I seal every member of the Vault cluster before running the above backup command with Consul.

This works fine, but while the Vault cluster is sealed, Vault clients fail to get/put data. When there are a few tens of GB of data in Vault, the backup can take quite a long time, so Vault clients are adversely affected.

Is it safe to run the ‘consul snapshot save …’ command while the Vault cluster is unsealed? If not, are there any alternatives, say toggling some options/switches to partially lock Vault secret paths, similar to MySQL database/table/row locking?

Currently, for fear of Vault data corruption, my backup procedure is:

1. Seal all server members of the Vault cluster.
2. On the first Vault server member, run ‘consul snapshot save <snapshot_file>’.
3. Unseal all server members of the Vault cluster.
4. Encrypt the snapshot and save it offsite.
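The encryption part of step 4 could be sketched as below. The thread does not say which tool is used, so GPG public-key encryption is an assumption, and the recipient address is a placeholder. Copying the result offsite is left out because the destination isn’t known.

```shell
# Hypothetical GPG recipient; replace with a key present in your keyring.
GPG_RECIPIENT="${GPG_RECIPIENT:-backup@example.com}"

# Encrypt SNAPSHOT_FILE to SNAPSHOT_FILE.gpg for the configured recipient.
encrypt_snapshot() {
    gpg --encrypt --recipient "$GPG_RECIPIENT" --output "$1.gpg" "$1"
}
```

For example, ‘encrypt_snapshot /var/backups/consul/daily.snap’ would write daily.snap.gpg next to the original file.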

Restores don’t happen often, but the steps are very similar to the above.

Please advise. Thanks,

There is no need to seal Vault. The snapshot only contains verified writes, so it can’t be out of sync. Although it’s not required, we only run the backup job on the Consul leader node, but that’s just so that we always know where the backup was taken from.
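A minimal sketch of the “only on the leader” convention, in case it helps. It assumes that ‘consul info’ on a server prints a line containing ‘leader = true’ when the local node is the leader; check that against your Consul version before relying on it.

```shell
# Succeeds only if the local Consul agent reports itself as the leader.
# Assumes `consul info` prints a "leader = true" line on the leader.
is_consul_leader() {
    consul info | grep -q 'leader = true'
}

# Take the snapshot only when running on the leader; otherwise skip quietly.
backup_if_leader() {
    if is_consul_leader; then
        consul snapshot save "$1"
    else
        echo "not the Consul leader, skipping backup" >&2
    fi
}
```

With this, the same cron job can be installed on every Consul server and only the current leader will actually write a snapshot file.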

Restore is a different process, obviously: you can’t have Vault up and running at all while doing a Consul restore. Shut down Vault completely, do your restore, then start Vault on only one node and let it tell you if there is an issue; if not, start the other nodes.
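The restore order described above could be sketched like this. It assumes Vault runs as a systemd unit named ‘vault’ and that the nodes are reachable over ssh with sudo rights; both are assumptions, and the host names in the usage line are placeholders.

```shell
# Usage: restore_vault_from_snapshot SNAPSHOT_FILE FIRST_NODE [OTHER_NODES...]
restore_vault_from_snapshot() {
    snapshot=$1
    shift
    first=$1

    # 1. Stop Vault on every node before touching Consul.
    for host in "$@"; do
        ssh "$host" sudo systemctl stop vault || return 1
    done

    # 2. Restore the Consul snapshot while no Vault node is running.
    consul snapshot restore "$snapshot" || return 1

    # 3. Start Vault on one node only; the rest stay down for now.
    ssh "$first" sudo systemctl start vault || return 1

    echo "Vault started on $first; unseal it and check its logs and" \
         "'vault status' before starting the remaining nodes."
}
```

For example: ‘restore_vault_from_snapshot daily.snap vault-1 vault-2 vault-3’. The remaining nodes are deliberately not started by the function, so a problem shows up on a single node first. Note that Vault comes up sealed after a restart unless auto-unseal is configured.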

Hi @Aram,

Thanks for the quick reply, it’s really very helpful.

For the snapshot backup part: when saving the snapshot takes a long time, say 3~5 minutes, some Vault write/update operations could happen during that period. Do you think updates could be backed up partially? I’m wondering what mechanism guarantees that no partial Vault operation results show up in Consul snapshots. Thanks,

For the restore part, shutting down Vault completely is definitely a safe, preventive step. Currently my Vault cluster is completely sealed, and when I test normal Vault get/put/list operations they are all blocked by the seal. But sure, shutting down is definitely even safer and doesn’t do any harm.

Best,

The snapshot is point-in-time: any updates made after you start the backup (or after Consul acknowledges the start of the snapshot) are not included in the backup.
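If you want to see which point in time a given file represents, ‘consul snapshot inspect’ prints the snapshot’s metadata, including the Raft index it was taken at. A small sketch (the function name is made up for illustration):

```shell
# Save a snapshot, then print its metadata (ID, size, Raft index, ...).
save_and_inspect() {
    consul snapshot save "$1" || return 1
    consul snapshot inspect "$1"
}
```

Running the inspect step right after each backup also works as a quick sanity check that the file is readable.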

Thanks for the confirmation. I’ll go ahead and run consul snapshot save directly, without sealing the Vault cluster VMs.