Kubernetes authentication method fails after a follower becomes the leader in a Vault HA cluster

During a recent automatic switchover of our Vault HA cluster, a follower took over as the leader. After that, our applications could no longer fetch secrets and received "permission denied" errors.

On investigation, we found that Kubernetes authentication was failing for our service accounts. As a workaround, we reconfigured the Kubernetes auth method on the Vault server with the following command:

vault write auth/kubernetes/config \
    token_reviewer_jwt="$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)" \
    kubernetes_host="https://${KUBERNETES_PORT_443_TCP_ADDR}:443" \
    kubernetes_ca_cert=@/var/run/secrets/kubernetes.io/serviceaccount/ca.crt
Success! Data written to: auth/kubernetes/config

While this resolved the immediate problem, we are concerned about the need to manually reconfigure Kubernetes authentication after every switch in the Vault cluster. This has become critical for us, as production services are being disrupted due to this issue.

We seek your assistance in ensuring that Kubernetes authentication is automatically configured with the Vault server whenever a switch occurs. Any guidance or support you can provide to address this matter would be highly appreciated.

Here are some additional details about our environment:

Vault Cluster Information:
Vault (v1.15.2) - 3 nodes HA cluster with Raft backend on Kubernetes.

vault-agent-init Log:

[ERROR] agent.auth.handler: error authenticating:
error=
| Error making API request.
|
| URL: PUT http://x.x.svc:8200/v1/auth/kubernetes/login
| Code: 403. Errors:
|
| * permission denied
backoff=1s

values.yaml for Vault Helm Chart:

server:
  extraEnvironmentVars:
    GOOGLE_REGION: $some_value
    GOOGLE_PROJECT: $some_value
    GOOGLE_APPLICATION_CREDENTIALS: $some_value

  extraVolumes:
    - type: 'secret'
      name: '$some_value'

  dataStorage:
    enabled: true
    size: $some_value

  ha:
    enabled: true
    replicas: 3

  raft:
    enabled: true
    config: |
      ui = true

      storage "raft" {
        path = "/vault/data"
      }

      listener "tcp" {
        address     = "0.0.0.0:8200"
        cluster_addr  = "0.0.0.0:8201"
        tls_disable = "true"
      }

      seal "gcpckms" {
        credentials = "$some_value"
        project = "$some_value"
        region = "$some_value"
        key_ring = "$some_value"
        crypto_key = "$some_value"
      }

Your prompt attention to this matter is highly appreciated. Thank you for your cooperation.

This issue has been fixed. Please configure the auth method as described below.

Use local service account token as the reviewer JWT

When running Vault in a Kubernetes pod, the recommended option is to use the pod’s local service account token. Vault will periodically re-read the file to support short-lived tokens. To use the local token and CA certificate, omit token_reviewer_jwt and kubernetes_ca_cert when configuring the auth method. Vault will attempt to load them from the token and ca.crt files, respectively, inside the default mount folder /var/run/secrets/kubernetes.io/serviceaccount/.

vault write auth/kubernetes/config \
    kubernetes_host=https://$KUBERNETES_SERVICE_HOST:$KUBERNETES_SERVICE_PORT

Note: Requires Vault 1.9.3+. In earlier versions, the service account token and CA certificate are read once and stored in Vault storage. When the service account token expires or is revoked, Vault will no longer be able to use the TokenReview API and client authentication will fail.
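To confirm that the reconfiguration took effect, it may help to read the config back and check whether a reviewer JWT is still stored. The sketch below is only an illustration: the function name check_k8s_reviewer_config is hypothetical, it assumes the auth method is mounted at auth/kubernetes unless another mount name is passed, and it assumes your Vault version exposes the token_reviewer_jwt_set field when reading the config (recent versions do, but check the output of vault read on your cluster).

```shell
# Hypothetical helper: reports whether a reviewer JWT is stored in the
# Kubernetes auth config. Assumes VAULT_ADDR and a valid token are already set.
check_k8s_reviewer_config() {
  local mount="${1:-kubernetes}"   # auth mount name; "kubernetes" is assumed
  local jwt_set

  # Read only the boolean field; fail with status 2 if the read itself fails.
  jwt_set="$(vault read -field=token_reviewer_jwt_set "auth/${mount}/config")" || return 2

  if [ "$jwt_set" = "true" ]; then
    echo "stored token_reviewer_jwt found"
    return 1
  fi
  echo "no stored token_reviewer_jwt"
}
```

Running check_k8s_reviewer_config after the vault write above should print "no stored token_reviewer_jwt" if the stored token was cleared. A status of 1 suggests the config still holds a fixed reviewer token, which would be worth re-checking before the next leader change.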