Error taking snapshot

rwilliams-devmon · November 23, 2021, 7:18pm

Hello, I started to get the following error when running a backup script utilizing approle role-id and secret-id. This issue started recently and I am not sure what the problem is.

Error taking the snapshot: incomplete snapshot, unable to read SHA256SUMS.sealed file

when trying to take a backup by running the command

vault operator raft snapshot save /var/lib/vault/snapshots/backup.snapshot
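For context, a backup step of the kind described here might look roughly like the sketch below. It is not the original script: the function name, the ROLE_ID and SECRET_ID variables, and the default auth/approle mount path are assumptions made for illustration.

```shell
# Sketch of an AppRole-based backup step. ROLE_ID, SECRET_ID and the
# function name are placeholders, not taken from the original script.
take_snapshot() {
    snapshot_path="$1"

    # Log in with AppRole (assumed to be mounted at auth/approle) and
    # keep only the client token from the response.
    VAULT_TOKEN="$(vault write -field=token auth/approle/login \
        role_id="$ROLE_ID" secret_id="$SECRET_ID")" || return 1
    export VAULT_TOKEN

    # Save the snapshot; the function returns the CLI's exit status.
    vault operator raft snapshot save "$snapshot_path"
}
```

Calling take_snapshot /var/lib/vault/snapshots/backup.snapshot would then reproduce the command above with a freshly issued token.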

aram · November 24, 2021, 12:09pm

Sounds like the user you’re running the snapshot command as doesn’t have read access to the directory where Vault is storing its data. Or possibly it doesn’t have write access to /var/lib/vault/snapshots.

ncabatoff · November 24, 2021, 1:50pm

Hi @rwilliams-devmon ,

This is a consequence of "Add code to api.RaftSnapshot to detect incomplete snapshots" (Pull Request #12388, hashicorp/vault). You were probably already having these issues, but now we’re detecting them. Your autoseal is probably failing, maybe (guessing) because it’s transit and the token has expired?

rwilliams-devmon · November 24, 2021, 2:17pm

The token for the backup is generated by an approle using a role-id and secret-id, so a new temp token gets created every time we run a backup. I verified that the token is generated and working correctly.

That said, our transit auto-unseal is currently broken: the token is not being renewed after 32 days because its max TTL has been reached. Usually when this happens the Vault cluster fails, but thankfully it has not, which is weird.

What is the best way to set up transit auto-unseal? My setup keeps failing, and whenever this happens our entire environment loses access to Vault credentials. There used to be a tutorial on HashiCorp Learn that walked you through the process. It is no longer there.

rwilliams-devmon · November 24, 2021, 2:18pm

Thanks for the response. Looking at the storage location, I do see a new backup being stored in /var/lib/vault/snapshots.

ncabatoff · November 24, 2021, 2:52pm

If the token for the snapshot were the issue then the snapshot request would fail with a permissions error. The error you’re seeing is typically due to an autoseal issue - sounds like that’s the case here too.
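The distinction drawn here can be sketched as a small helper that a backup script could use to decide where to look first. The function and category names are made up for this example; the matched strings are the two error messages quoted in this thread.

```shell
# Classify the error text of a failed `vault operator raft snapshot save`.
# The category names are hypothetical labels for this example.
classify_snapshot_error() {
    case "$1" in
        *"permission denied"*)
            # The token or its policy was rejected by the server.
            echo "token-or-policy" ;;
        *"unable to read SHA256SUMS.sealed"*)
            # Per this reply, typically an autoseal problem.
            echo "incomplete-snapshot" ;;
        *)
            echo "unknown" ;;
    esac
}
```

A script could capture the CLI's stderr into a variable and pass it to classify_snapshot_error before deciding whether to alert on the token or on the seal.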

Is this the tutorial you’re thinking of: Auto-unseal using Transit Secrets Engine | Vault - HashiCorp Learn ?

rwilliams-devmon · November 29, 2021, 9:31pm

Thanks for the response, I was referring to another tutorial that was on Hashicorp Learn but it looks like it is no longer available. Issuing a new auto-unseal token fixed the issue. Thanks for the help!
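One way to avoid hitting a max TTL on the unseal token is to issue a periodic token, which has no max TTL as long as it keeps being renewed within its period. The sketch below is an assumption about how that could look, not the fix that was actually applied here: the policy name "autounseal" and the 24h period are illustrative, and the command would be run against the Vault that hosts the transit key, with a token that is allowed to create periodic tokens.

```shell
# Sketch: issue an orphan, periodic token for the transit seal.
# The policy name "autounseal" and the 24h period are assumptions.
create_unseal_token() {
    vault token create -orphan -policy="autounseal" -period=24h -field=token
}
```

The printed token would then be supplied to the cluster being unsealed, for example through its seal configuration.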

christopher.damerau · March 17, 2022, 10:43am

Has this actually been resolved? I am getting the same error message on a test setup (5 nodes, integrated storage + external LB). It seems that about 4 out of 20 requests succeed, but which ones succeed appears to be random.

andrew.klimovski · May 10, 2022, 4:02am

I was facing the same issue, however I’m using auto-unseal with HA and Raft.

Per the comments above I tried to re-key Vault per the instructions here, but ran into another error around expired secrets.

In resolving the issues with expired secrets I discovered that simply bringing down each of the nodes and allowing them to spin back up and auto-unseal resolved the issue: snapshot backup is now working as expected. The vault operator step-down command was super helpful in bringing down the active node.

As far as this error goes I think Vault has some work to do. I’ve noted 3 mechanisms for creating snapshots with only 1 highlighting the issue with the snapshot and failing, and the other 2 methods generating binaries that were corrupt which I only discovered when attempting to restore.

vault cli - shows the error
vault UI - creates broken binary
curl - creates broken binary

Would hurt so bad to realise the backups you’ve been taking are corrupt when you go to restore!
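For snapshots fetched with curl or the UI, a rough pre-restore sanity check is possible. The sketch below assumes the snapshot is a gzip-compressed tar archive in which a complete snapshot carries an entry named SHA256SUMS.sealed, as the CLI error message suggests. It only checks that the entry is present; it does not prove the snapshot will restore. The function name is made up for this example.

```shell
# Return 0 if the snapshot archive lists a SHA256SUMS.sealed entry.
# Assumes the snapshot is a gzip-compressed tar archive.
snapshot_has_sealed_sums() {
    # An unreadable archive produces no listing, so grep fails as well.
    tar -tzf "$1" 2>/dev/null | grep -Eq '^(\./)?SHA256SUMS\.sealed$'
}
```

A backup job could run snapshot_has_sealed_sums on each new file and raise an alert when it returns non-zero, instead of finding out at restore time.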

ncabatoff · May 10, 2022, 12:02pm

Hi @andrew.klimovski ,

Could you file a github issue for case (2) please? I don’t know how to make things better for curl, but we should be able to improve the UI.

andrew.klimovski · May 13, 2022, 6:11am

Going through the change notes, it looks like this feature was rolled into release v1.9.0. However, my Vault instance is on v1.8.X and the CLI tool is on v1.9.2, which is likely why we weren’t seeing any issues via the UI or curl. Once we’ve upgraded, if we see the issue crop up again, I’ll raise a ticket.

rwilliams-devmon · May 13, 2022, 2:38pm

I was originally receiving this error due to one of the auto-unseal keys being expired.

aram · May 13, 2022, 8:34pm

Hmmm if your auto-unseal “token” expires, you’d end up with a sealed instance. That’s probably a bigger deal than your backup not running.

seanamos · May 8, 2023, 7:06pm

There is a bug with snapshot save when running it from standby nodes.
This bug is tracked here: `vault operator raft snapshot save` and `restore` fail to handle redirection to the active node · Issue #15258 · hashicorp/vault · GitHub

The workaround at the moment is to run the snapshot from the active/leader node.
If you have Consul DNS set up, you can achieve this by setting:

VAULT_ADDR=https://active.vault.service.${dc}.consul:8200
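A backup script could wrap this workaround as in the sketch below. The function names are hypothetical, and the address pattern, https scheme and port 8200 are simply taken from the line above; adjust them to your own Consul service registration.

```shell
# Build the Consul DNS address of the active Vault node for a datacenter,
# following the pattern shown above.
active_vault_addr() {
    echo "https://active.vault.service.$1.consul:8200"
}

# Run the snapshot against the active node only.
snapshot_from_active() {
    dc="$1"
    snapshot_path="$2"
    VAULT_ADDR="$(active_vault_addr "$dc")" \
        vault operator raft snapshot save "$snapshot_path"
}
```

For example, snapshot_from_active dc1 /var/lib/vault/snapshots/backup.snapshot would target the active node in a datacenter named dc1.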

olinIgorov · August 18, 2023, 1:09pm

I was just about to confirm this. When I run snapshot save on standby nodes, it says exactly "Error taking the snapshot: incomplete snapshot, unable to read SHA256SUMS.sealed file", but when I run it on the leader, it works as it should.

david-nano · October 11, 2023, 10:09am

On the leader node I get

Error taking the snapshot: Error making API request.

URL: GET http://127.0.0.1:8200/v1/sys/storage/raft/snapshot
Code: 403. Errors:

* permission denied

but on all the other nodes I get

Error taking the snapshot: incomplete snapshot, unable to read SHA256SUMS.sealed file

what is wrong here?

olinIgorov · October 12, 2023, 12:33pm

Hello

Regarding the following error, I would say that on the leader node you are using a token that lacks the permissions needed to create snapshots:

Error taking the snapshot: Error making API request.

URL: GET http://127.0.0.1:8200/v1/sys/storage/raft/snapshot
Code: 403. Errors:

* permission denied

and regarding

Error taking the snapshot: incomplete snapshot, unable to read SHA256SUMS.sealed file

It seems to me that this is the default error message when you try to create a snapshot on a standby node (not the leader node).
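For the permission denied case, a minimal policy sketch is shown below. It assumes that read access to the endpoint shown in the 403 error is what the token is missing; the policy name "raft-snapshot" and the function name are placeholders.

```shell
# Write a policy granting read access to the snapshot endpoint.
# The policy name "raft-snapshot" is a placeholder.
write_snapshot_policy() {
    vault policy write raft-snapshot - <<'EOF'
path "sys/storage/raft/snapshot" {
  capabilities = ["read"]
}
EOF
}
```

The policy would then need to be attached to whatever token or auth role the backup job uses.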

david-nano · October 12, 2023, 1:42pm

I used the root token that was generated after installation.

Joffrey · October 16, 2023, 8:17am

Are you using -dev mode? (I see you use the http scheme.) Perhaps it does not work in -dev mode because that uses in-memory storage, not Raft.

olinIgorov · October 24, 2023, 2:11pm

It doesn’t matter whether you use a root token if the node is not the leader. The snapshot is available only on the leader.
