Vault setup on Kubernetes using operator & migrating data from one Vault to another

I realise this post is 9 months old, but for completeness:

It is quite difficult to do.

You have to scale down the main Vault pods to zero and manually create a temporary pod that does not run Vault but mounts the persistent volumes you need access to.
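
A minimal sketch of that approach, assuming a StatefulSet named vault in the vault namespace and a PVC named data-vault-0 (all of these names are assumptions; substitute your own):

# Stop the real Vault pods
kubectl -n vault scale statefulset vault --replicas=0

# Temporary pod that mounts the Vault PVC but only sleeps, never runs Vault
kubectl -n vault apply -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: vault-data-access
spec:
  containers:
  - name: shell
    image: busybox             # any image with a shell will do
    command: ["sleep", "999d"] # keep the pod alive without starting Vault
    volumeMounts:
    - name: data
      mountPath: /vault/data
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: data-vault-0  # assumed PVC name
EOF

# Shell into it to inspect or copy the storage
kubectl -n vault exec -it vault-data-access -- sh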

Alternatively, you do the migration into a temporary single-node Raft setup that you run outside of Kubernetes, take a Raft snapshot from that, and upload the snapshot via the snapshot restore HTTP API.
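
A hedged sketch of that route, assuming VAULT_ADDR and VAULT_TOKEN point at the temporary single-node Raft setup; the new cluster's address and token variable are assumptions:

# Take a snapshot from the temporary single-node Raft instance
vault operator raft snapshot save migration.snap

# Upload the snapshot to the new cluster via the snapshot restore HTTP API;
# the -force variant is needed when the target was initialized with different keys
curl --header "X-Vault-Token: $NEW_VAULT_TOKEN" \
     --request POST \
     --data-binary @migration.snap \
     https://vault.example.internal:8200/v1/sys/storage/raft/snapshot-force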

Okay, I understand. So these are the steps I followed:

  1. Create a temporary Vault Raft instance running in RKE2 with 1 replica (don't initialize Vault)
  2. Exec into the old Vault container, which uses the file storage type
  3. Go to the /vault/ folder and create a raft folder in it
  4. Run the command vault operator migrate --config migrate.hcl
    The migrate.hcl file looks like this:
storage_source "file" {
  path = "/vault/data/"
}

storage_destination "raft" {
  path = "/vault/raft/"
  node_id = "vault-raft-0"
}

cluster_addr="https://127.0.0.1:8201"
  5. The migration completed and created a vault.db file in /vault/raft/ and a raft.db file in /vault/raft/raft/, along with an empty folder called snapshots.
  6. Then I copied the whole /vault/raft/ folder to my local PC and copied it from there into the temporary Vault Raft container, which has the same data storage mount path: /vault/raft/ (see the kubectl cp sketch after this list).
  7. After copying the files I re-deployed the temporary Vault Raft, since the PVC won't be deleted, and checked that it still contained the copied .db files.
  8. In the end I tried to unseal it, but after entering the third unseal key it returned the following message: Error unsealing: context deadline exceeded
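
For step 6, a hedged sketch of the copy commands, assuming pod names vault-old-0 and vault-raft-0 (both hypothetical; use your own pod names and namespaces):

# Pull the migrated Raft data from the old pod down to the local PC
kubectl cp vault-old-0:/vault/raft ./raft

# Push it into the temporary Vault Raft pod at the same mount path
kubectl cp ./raft vault-raft-0:/vault/raft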

Am I doing something completely wrong?

It looks like you did most of the steps correctly. However, it's better to follow Vault's best practices to ensure that nothing goes wrong while copying your data from Old Cluster → PC → New Cluster. A lot of things can go wrong here.

One thing that I notice here is that you are running the migration on the old Vault server. The migration should be run on the new Vault leader. However, this shouldn't be the issue.

Since you are talking about a standalone server setup, you can simply follow these steps:

  1. Create your new Vault server instance and set up the server configuration, but do not initialize or start the server yet. This server will be your Vault Raft server.
  2. Shut down your old Vault server. A migration should always be done offline; for example, the migrated-to Vault server is also not allowed to be started during migration. This is a step I didn't see you doing.
    However, I can understand that you might not be able to do this since you are running in containers. If possible, try to mount the volume to a different container/Pod, or maybe even to your new Vault instance.
  3. Back up your old Vault server's data.
  4. Create your migrate.hcl migration configuration file; let's take your example above.
  5. Copy your old Vault server's data to your new Vault server, into the directory given by the path entry of the storage_source "file" stanza. In your case this is /vault/data.
  6. Create the directory given by the path entry of the storage_destination stanza. In your case this is /vault/raft.
  7. Run the migration using your config file: vault operator migrate --config migrate.hcl.
    A couple of log lines should show up, including a few with copied key: path=. What happens in the background is that a Raft server is started and, without decrypting anything, it converts all of your data to the Raft data format.
  8. Now simply start your new Vault server and unseal it using the old unseal keys; the underlying encryption has not changed. (A condensed shell sketch of these steps follows below.)
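
A condensed shell sketch of steps 3 through 8, assuming both data directories are reachable from the machine running the migration and using the paths from the migrate.hcl above (the server config path is an assumption):

# 3. Back up the old server's data first
tar czf vault-file-backup.tar.gz /vault/data

# 6. Create the Raft destination directory
mkdir -p /vault/raft

# 7. Run the offline migration; both old and new Vault must be stopped
vault operator migrate --config migrate.hcl

# 8. Start the new server, then unseal with the OLD unseal keys
vault server -config /vault/config/vault.hcl &
vault operator unseal   # repeat until the unseal threshold is reached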

This worked for me. Otherwise, I would recommend discreetly sharing your log files and server configuration so we can see if anything is out of the ordinary when starting up your new Vault server.

A side note for everyone: if you are running multiple nodes in your Raft cluster, all of your other nodes need to re-join the migrated node before unsealing.
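
A hedged example of that re-join, run on each of the other nodes (the leader address is an assumption):

# On every other node: join it to the freshly migrated node, then unseal
vault operator raft join https://vault-raft-0.vault-internal:8200
vault operator unseal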

Wait, what?

Users are never supposed to create this themselves.

Hmm, odd. This is what I was told some time ago to do, in addition to creating the folder.
It was done to ensure correct file permissions and such; when I asked whether this was a good idea, I was told the file gets overwritten either way.

Firstly, thanks a lot for the extensive explanation!
I don't know if it's possible to mount the old data volume to the new Vault instance while that new instance needs to be offline. I simply don't know how to shut down the server while Vault still needs to be running in a container that's managed by a Helm chart.
I will try something with another container, created manually from the Vault agent image. I'll keep you updated with more log files and server configuration if it still doesn't work. If the migration is successful I will also share my configuration and explain step by step what I did, for people who also struggle with this.

You might find this post useful, in which I explored in more detail the issue of needing a pod where Vault is not running, in response to a previous related question:


Alright. Thanks to @RemcoBuddelmeijer and @maxb the migration was successful.
It's indeed important to shut down the Vault server. I mounted the two PVCs to a manually created pod with the Vault agent image and added the following command in the Kubernetes manifest:

          command:
          - /usr/local/bin/docker-entrypoint.sh
          - sleep
          - 999d

Thanks to this command the Vault server won't start, as @maxb pointed out.
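
For reference, a hedged sketch of how the full pod spec around that command might look, with the two PVC mounts (the image and claim names are assumptions):

apiVersion: v1
kind: Pod
metadata:
  name: vault-migrate
spec:
  containers:
  - name: vault
    image: hashicorp/vault     # assumed image; the Vault agent image also works
    command:
    - /usr/local/bin/docker-entrypoint.sh
    - sleep
    - 999d
    volumeMounts:
    - name: old-data
      mountPath: /vault/data   # PVC with the old file storage
    - name: new-raft
      mountPath: /vault/raft   # PVC for the new Raft storage
  volumes:
  - name: old-data
    persistentVolumeClaim:
      claimName: data-vault-old-0    # assumed claim name
  - name: new-raft
    persistentVolumeClaim:
      claimName: data-vault-raft-0   # assumed claim name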

I think this might help: https://github.com/framsouza/vault-storage-migration-on-k8s

Hi Purushotham,
We have a similar requirement, and I am facing issues migrating data from the old Vault to the new Vault. Can you suggest steps to migrate data to a new Vault that is set up on k8s?

Thanks