Custom plugin upgrade in Kubernetes StatefulSet

Hi team,

I’m writing to ask for help in improving the custom plugin upgrade process for our Kubernetes StatefulSet running Vault.

Our current setup is as follows:

  • We have developed our own plugins for Vault.
  • We have 3 replicas of the Vault pod in the StatefulSet, using the “RollingUpdate” update strategy.
  • When a pod starts, its init container checks whether the image ships a new plugin version and, if so, upgrades the plugin by registering the new binary’s checksum (a sketch of this logic follows the list).
  • The main container just runs the Vault server.
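
For reference, this is roughly what the init container does, sketched with the official Vault Go client (github.com/hashicorp/vault/api); the plugin name, binary path, and auth handling are placeholders, and error handling is trimmed:

```go
// Sketch of the init-container check: if the plugin binary baked into this
// image differs from what is registered in Vault's plugin catalog,
// re-register it with the new SHA-256. Names and paths are placeholders.
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"log"
	"os"

	vault "github.com/hashicorp/vault/api"
)

func main() {
	const (
		pluginName = "my-secrets-plugin"                // placeholder
		pluginBin  = "/vault/plugins/my-secrets-plugin" // placeholder
	)

	// Checksum of the binary shipped in this image.
	data, err := os.ReadFile(pluginBin)
	if err != nil {
		log.Fatalf("read plugin binary: %v", err)
	}
	sum := sha256.Sum256(data)
	localSHA := hex.EncodeToString(sum[:])

	// VAULT_ADDR and VAULT_TOKEN are picked up from the environment.
	client, err := vault.NewClient(vault.DefaultConfig())
	if err != nil {
		log.Fatalf("create client: %v", err)
	}

	// Skip registration if the catalog already has this checksum.
	info, err := client.Sys().GetPlugin(&vault.GetPluginInput{
		Name: pluginName,
		Type: vault.PluginTypeSecrets,
	})
	if err == nil && info.SHA256 == localSHA {
		log.Println("registered checksum already matches; nothing to do")
		return
	}

	// Register (or re-register) the plugin with the new checksum.
	if err := client.Sys().RegisterPlugin(&vault.RegisterPluginInput{
		Name:    pluginName,
		Type:    vault.PluginTypeSecrets,
		Command: "my-secrets-plugin", // relative to Vault's plugin_directory
		SHA256:  localSHA,
	}); err != nil {
		log.Fatalf("register plugin: %v", err)
	}
	log.Println("registered new plugin checksum:", localSHA)
}
```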

One of the possible upgrade scenarios is as follows:

  1. The StatefulSet is updated with a new Vault image.
  2. The Vault-2 pod, which was the leader, restarts. Vault-1 is elected the new leader.
  3. Vault-2 finds that its plugin version differs from the currently registered one.
  4. Vault-2 registers the new plugin version.
  5. Vault-2 starts the main container with the new Vault version and enters standby mode.
  6. Vault-1 restarts. Vault-0 becomes the active pod and the leader.
  7. Vault-0 cannot run the plugin because its old binary doesn’t match the newly registered checksum.
  8. Vault-1 starts running the new Vault version and enters standby mode.
  9. Vault-0 restarts. Vault-2 is selected to be the leader pod.
  10. Vault-2 starts running the new plugin version.

In this scenario, there is downtime from step 4 to step 10 because the leader pod can’t serve plugin requests (the checksums don’t match). In the worst case this lasts up to about 2 minutes; sometimes Vault-2 is immediately re-elected leader, in which case there is almost no downtime.
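
To make step 7 concrete, here is a small diagnostic sketch (again using the Go client, with placeholder names and paths) that reports, for whichever pod VAULT_ADDR points at, whether it is the active node and whether its local plugin binary matches the checksum currently registered in the catalog; a mismatch on the active node is exactly the state that causes the downtime:

```go
// Diagnostic sketch for step 7: is this Vault node active, and does the
// plugin binary in this pod match the checksum registered in the catalog?
// Names and paths are placeholders.
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"log"
	"os"

	vault "github.com/hashicorp/vault/api"
)

func main() {
	const (
		pluginName = "my-secrets-plugin"                // placeholder
		pluginBin  = "/vault/plugins/my-secrets-plugin" // placeholder
	)

	// VAULT_ADDR should point at this specific pod, not the service,
	// so the health response describes this node.
	client, err := vault.NewClient(vault.DefaultConfig())
	if err != nil {
		log.Fatalf("create client: %v", err)
	}

	health, err := client.Sys().Health()
	if err != nil {
		log.Fatalf("health: %v", err)
	}

	// Checksum of the binary in this pod's image.
	data, err := os.ReadFile(pluginBin)
	if err != nil {
		log.Fatalf("read plugin binary: %v", err)
	}
	sum := sha256.Sum256(data)
	localSHA := hex.EncodeToString(sum[:])

	// Checksum currently registered in the plugin catalog.
	info, err := client.Sys().GetPlugin(&vault.GetPluginInput{
		Name: pluginName,
		Type: vault.PluginTypeSecrets,
	})
	if err != nil {
		log.Fatalf("get plugin: %v", err)
	}

	fmt.Printf("standby: %v\n", health.Standby)
	if info.SHA256 != localSHA {
		fmt.Printf("MISMATCH: registered %s, local %s\n", info.SHA256, localSHA)
		if !health.Standby {
			fmt.Println("this is the active node, so plugin requests will fail here")
		}
	} else {
		fmt.Println("local binary matches the registered checksum")
	}
}
```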

I’m wondering how we can improve the worst-case scenario to decrease the downtime.
Thank you in advance.

PS I found that a request to a leader pod running an old plugin version sometimes succeeds, and sometimes the same request fails with the error message “failed to run existence check (checksums did not match)”.
What determines whether the request succeeds or fails?

This issue came up in the Vault issue tracker before:

I believe the essence of the problem is that the Vault plugin mechanism is fundamentally incompatible with Kubernetes. It appears to be designed for deployments on VMs, where plugins are only updated whilst stable Vault servers continue to run undisturbed.

I proposed in the above-linked issue:

I wonder if HashiCorp would be willing to have a conversation about making the checksum verification of plugins optional … the current approach doesn’t seem well suited to maintaining uptime of a cluster during upgrade in K8s?

There was a response, but I declined to take the lead on pursuing it, as I personally do not run Vault on Kubernetes.