I have not done this before, but your question is interesting to me, as I think I might need to do something similar in the future…
As you have noticed, whilst Helm & Kubernetes make the initial deployment quite easy, they can make complex operations thereafter harder…
First, I tried helm upgrade from a Consul-storage deployment to a Raft-storage deployment, just to see what would happen…
Error: UPGRADE FAILED: cannot patch "vault" with kind StatefulSet: StatefulSet.apps "vault" is invalid: spec: Forbidden: updates to statefulset spec for fields other than 'replicas', 'template', 'updateStrategy', 'persistentVolumeClaimRetentionPolicy' and 'minReadySeconds' are forbidden
It seems the problem is the Helm chart wants to change volumeClaimTemplates so that PVCs are generated for the Vault StatefulSet … which makes sense … but also isn’t supported by Kubernetes.
Oh well… Vault storage migration requires downtime anyway, so the fact we have to delete the StatefulSet to replace it isn’t really making things worse.
Where things start getting trickier is that we need somewhere to run the storage migration… meaning we need somewhere with access to the mounted Persistent Volume whilst Vault is not running…
Trying to put all these constraints together, I came up with the following rough draft of a migration plan…
But before that: in the middle of the procedure, we’re going to need a way to actually run the migration, and when we do, we need:
- Access to the Vault CLI binary
- Access to the data volume
- Vault server to NOT be running
I can’t see any way to make that happen using the existing server pods, since if you kill the server process, the pod will terminate.
That means we need to make our own “maintenance” pod definition, and if we’re using Helm anyway, we might as well create the “maintenance” pod using it too.
So… make sure you’re using a local copy of the Vault Helm chart so you can easily make modifications, and copy the templates/server-statefulset.yaml file to set up a new StatefulSet that will define our optional “maintenance” pod (a rough sketch of the combined result follows this list):
- The metadata.name will need to be different to distinguish it, as will the spec.serviceName (add a suffix -maint?)
- component: server will need to change to something like component: maintenance to set it apart (in both places it appears)
- Various other optional parts of the YAML might be applicable only to the running servers and not a maintenance pod, depending on what you have configured
- The readinessProbe, livenessProbe, lifecycle, and the template logic rendering the volumeClaimTemplates are not wanted for a maintenance pod
- But we need to add an explicit reference to the volume we want to mount to the volumes section instead:

      - name: data
        persistentVolumeClaim:
          claimName: data-vault-0
- As well as deleting the args and changing the command so we run a dummy command instead of starting a real Vault server:

      command:
        - /usr/local/bin/docker-entrypoint.sh
        - sleep
        - 999d
- And we’ll set spec.replicas to 0, so that we only have a maintenance pod when we manually scale up this StatefulSet.
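To make that more concrete, here’s a rough, untested sketch of what the trimmed-down maintenance StatefulSet might end up looking like after those edits. The names, labels, image tag and service account below are illustrative guesses on my part rather than values lifted from the real chart, so adjust them to match your deployment:

    apiVersion: apps/v1
    kind: StatefulSet
    metadata:
      name: vault-maint                   # distinct from the real "vault" StatefulSet
      labels:
        component: maintenance
    spec:
      serviceName: vault-maint            # the pod runs fine without a matching Service
      replicas: 0                         # scaled to 1 only during maintenance windows
      selector:
        matchLabels:
          component: maintenance
      template:
        metadata:
          labels:
            component: maintenance
        spec:
          serviceAccountName: vault       # reuse whatever service account your servers use
          volumes:
            - name: data
              persistentVolumeClaim:
                claimName: data-vault-0   # the PVC belonging to vault-0
          containers:
            - name: vault
              image: hashicorp/vault:1.15.2   # match the image/tag your servers run
              command:
                - /usr/local/bin/docker-entrypoint.sh
                - sleep
                - 999d
              volumeMounts:
                - name: data
                  mountPath: /vault/data
              # no readinessProbe, livenessProbe or lifecycle hooks for the maintenance pod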
With all of that prepared … (rough command sketches for these steps follow the list)
- Schedule planned downtime in advance
- Scale the Vault StatefulSet to zero replicas (the Vault service is now offline)
- Consider taking a backup just to be safe… although we’ll be leaving the old Consul pods in existence, so in a way they are a backup themselves.
- Manually kubectl delete the StatefulSet, since we need to replace it
- helm upgrade the Vault chart with values that specify Raft storage (example values follow the list)
- Scale the new Vault StatefulSet to zero replicas, because once it has initialised the volumes, we need Vault not running to do the storage migration
- Scale the maintenance StatefulSet to 1
- kubectl exec -it podname -- sh into the maintenance pod
- In the maintenance pod interactive session, create a configuration file for vault operator migrate and run the migration (a config sketch follows the list)… but maybe before you start, wipe the initial contents of /vault/data/ that were created when the new server pod first started up and initialised an empty database?
- Scale the maintenance StatefulSet to 0, and the main StatefulSet back to your desired number of replicas
- Depending on the details of your configuration, it’s possible all your replicas will find each other and replicate the migrated data to the other nodes, or perhaps some executions of vault operator raft join are needed (example after the list).
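On the helm upgrade side, the Raft values override might look something like the following. I’m assuming the official hashicorp/vault chart here, and the exact keys can vary between chart versions, so treat it as a starting point rather than a drop-in file (save it as raft-values.yaml or similar):

    server:
      dataStorage:
        enabled: true
        size: 10Gi
      ha:
        enabled: true
        replicas: 3
        raft:
          enabled: true
          setNodeId: true
          config: |
            ui = true

            listener "tcp" {
              address         = "[::]:8200"
              cluster_address = "[::]:8201"
              tls_disable     = 1        # assuming no TLS on the listener
            }

            storage "raft" {
              path = "/vault/data"
            }

            service_registration "kubernetes" {}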
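As for the actual commands during the downtime window, I’d expect it to go roughly like this. I’m assuming the Helm release and StatefulSet are both named vault, the maintenance StatefulSet is vault-maint, and the local chart copy lives in ./vault-helm, so substitute your own names and namespace:

    # Vault goes offline here
    kubectl scale statefulset vault --replicas=0

    # the spec can't be patched into the new shape, so delete the StatefulSet
    # (the PVCs and the Consul pods are left untouched)
    kubectl delete statefulset vault

    # re-deploy from the local chart copy with the Raft storage values
    helm upgrade vault ./vault-helm -f raft-values.yaml

    # once the new StatefulSet has created its PVCs, stop the servers again
    kubectl scale statefulset vault --replicas=0

    # bring up the maintenance pod and get a shell in it
    kubectl scale statefulset vault-maint --replicas=1
    kubectl exec -it vault-maint-0 -- sh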
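For the migration itself, the vault operator migrate config inside the maintenance pod would be along these lines. The Consul address, node_id and cluster_addr are placeholders that need to match your Consul service and your new Raft configuration:

    # /tmp/migrate.hcl -- placeholder values, adjust to your environment
    storage_source "consul" {
      address = "consul-server.consul.svc:8500"
      path    = "vault/"
    }

    storage_destination "raft" {
      path    = "/vault/data"
      node_id = "vault-0"
    }

    # needed when the destination is raft
    cluster_addr = "https://vault-0.vault-internal:8201"

Then, after (carefully!) wiping the empty database that the new server pod created on its first start, run the migration:

    # this removes the freshly initialised data, leaving the volume empty for the migration
    rm -rf /vault/data/*

    vault operator migrate -config=/tmp/migrate.hcl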
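And if the remaining nodes don’t join the new cluster on their own, joining them manually from each additional pod would look something like this (assuming the release is named vault, so the chart’s headless service is vault-internal; the scheme and port depend on your TLS setup):

    # run inside vault-1, vault-2, ... (e.g. via kubectl exec)
    vault operator raft join http://vault-0.vault-internal:8200

    # each node still needs to be unsealed afterwards
    vault operator unseal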
Not at all tested in full! I was pretty far into “thought experiment” territory by the end of typing all that. But, hopefully it’s a decent source of inspiration if you want to work through productionising something based on this.