We are running Vault in an EKS cluster on AWS. It is configured with S3 as the storage backend, and the bucket is encrypted with a KMS key.
Today something triggered a re-deployment of the Vault service in the cluster. This caused the files in the core folder to be deleted with a force-destroy (so previous versions and delete markers are gone too) and replaced with new files.
We still have the root key, the KMS key and all of the recovery keys, but we haven't found a way to unseal Vault.
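For anyone who wants to verify the same state, the checks can be sketched with the AWS CLI roughly as follows. This is a minimal sketch: the bucket name and key alias are placeholders (assumptions, not values from our setup), and the function names are made up for illustration.

```shell
#!/bin/sh
# Placeholders (assumptions): replace with the real bucket and KMS key ID or alias.
BUCKET="my-vault-bucket"
KMS_KEY_ID="alias/my-vault-key"

# List any remaining object versions and delete markers under the core/ prefix.
# If no Versions or DeleteMarkers come back for the old objects, there is
# nothing left in S3 to restore them from.
list_core_versions() {
  aws s3api list-object-versions \
    --bucket "$BUCKET" \
    --prefix "core/" \
    --output json
}

# Print the state of the KMS key (for example Enabled, Disabled or PendingDeletion).
kms_key_state() {
  aws kms describe-key \
    --key-id "$KMS_KEY_ID" \
    --query 'KeyMetadata.KeyState' \
    --output text
}
```

After sourcing the snippet, run `list_core_versions` and `kms_key_state` with credentials that can read the bucket and describe the key.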
This is the log we get in a loop in the pod running the service:
2023-03-31T04:09:12.303Z [INFO] core.autoseal: seal configuration missing, but cannot check old path as core is sealed: seal_type=recovery
2023-03-31T04:09:14.308Z [WARN] core: stored keys supported on init, forcing shares/threshold to 1
2023-03-31T04:09:14.334Z [INFO] core: security barrier not initialized
2023-03-31T04:09:14.441Z [INFO] core: security barrier initialized: stored=1 shares=1 threshold=1
2023-03-31T04:09:14.613Z [INFO] core: post-unseal setup starting
2023-03-31T04:09:14.717Z [INFO] core: loaded wrapping token key
2023-03-31T04:09:14.717Z [INFO] core: successfully setup plugin catalog: plugin-directory=
2023-03-31T04:09:14.740Z [INFO] core: no mounts; adding default mount table
2023-03-31T04:09:14.813Z [INFO] core: successfully mounted backend: type=cubbyhole path=cubbyhole/
2023-03-31T04:09:14.813Z [INFO] core: successfully mounted backend: type=system path=sys/
2023-03-31T04:09:14.814Z [INFO] core: successfully mounted backend: type=identity path=identity/
2023-03-31T04:09:14.895Z [INFO] core: pre-seal teardown starting
2023-03-31T04:09:14.895Z [INFO] core: pre-seal teardown complete
2023-03-31T04:09:14.895Z [ERROR] core: post-unseal setup failed during init: error="error fetching default policy from store: failed to read policy: decryption failed: cipher: message authentication failed"
2023-03-31T04:09:15.368Z [INFO] core: stored unseal keys supported, attempting fetch
2023-03-31T04:09:15.466Z [INFO] core.cluster-listener.tcp: starting listener: listener_address=[::]:8201
2023-03-31T04:09:15.466Z [INFO] core.cluster-listener: serving cluster requests: cluster_listen_address=[::]:8201
2023-03-31T04:09:15.488Z [INFO] core: post-unseal setup starting
2023-03-31T04:09:15.556Z [INFO] core: loaded wrapping token key
2023-03-31T04:09:15.556Z [INFO] core: successfully setup plugin catalog: plugin-directory=
2023-03-31T04:09:15.604Z [INFO] core: successfully mounted backend: type=system path=sys/
2023-03-31T04:09:15.604Z [INFO] core: successfully mounted backend: type=identity path=identity/
2023-03-31T04:09:15.604Z [INFO] core: successfully mounted backend: type=cubbyhole path=cubbyhole/
2023-03-31T04:09:15.641Z [INFO] core: pre-seal teardown starting
2023-03-31T04:09:15.641Z [INFO] core: pre-seal teardown complete
2023-03-31T04:09:15.641Z [ERROR] core: post-unseal setup failed: error="error fetching default policy from store: failed to read policy: decryption failed: cipher: message authentication failed"
2023-03-31T04:09:15.641Z [WARN] core: vault is sealed
2023-03-31T04:09:15.641Z [WARN] failed to unseal core: error="unseal with stored key failed: error fetching default policy from store: failed to read policy: decryption failed: cipher: message authentication failed"
2023-03-31T04:09:17.282Z [INFO] core.autoseal: seal configuration missing, but cannot check old path as core is sealed: seal_type=recovery
We have tried a rekey of the recovery keys, but we get the following error:
$ vault operator rekey -target=recovery -init -key-shares=5 -key-threshold=3
Error initializing rekey: Error making API request.
URL: PUT http://127.0.0.1:8200/v1/sys/rekey-recovery-key/init
Code: 503. Errors:
* node is not active
We tried creating a new config file manually to attempt an unseal, but we get an error saying that no recovery key was found:
/ $ vault operator unseal
Unseal Key (will be hidden):
Key                      Value
---                      -----
Recovery Seal Type       shamir
Initialized              true
Sealed                   true
Total Recovery Shares    5
Threshold                3
Unseal Progress          2/3
Unseal Nonce             b*****d
Version                  1.6.2
Storage Type             s3
HA Enabled               false
/ $ vault operator unseal
Unseal Key (will be hidden):
Error unsealing: Error making API request.
URL: PUT http://127.0.0.1:8200/v1/sys/unseal
Code: 500. Errors:
* no recovery key found
Otherwise, the vault status looks like this:
/ $ vault status
Key                      Value
---                      -----
Recovery Seal Type       awskms
Initialized              true
Sealed                   true
Total Recovery Shares    0
Threshold                0
Unseal Progress          0/0
Unseal Nonce             n/a
Version                  1.6.2
Storage Type             s3
HA Enabled               false
We also tried starting the service in recovery mode, but then we get a 404 in response to every request:
$ vault status
Error checking seal status: Error making API request.
URL: GET http://127.0.0.1:8200/v1/sys/seal-status
Code: 404. Raw Message:
404 page not found
At this point we are out of ideas, so any help or advice would be more than welcome.