[INFO] agent.auth.handler: authenticating [URL: PUT http://x.x.svc:8200/v1/auth/kubernetes/login|Code: 403|permission denied]

I trust this message finds you in good health. I am reaching out to provide insights into a recent incident involving our HashiCorp Vault deployment on our Kubernetes cluster.
We currently run three Vault nodes, consisting of one leader and two followers. An unexpected event occurred recently when node 2 automatically transitioned to become the leader of the cluster. Subsequently, applications encountered difficulties fetching secrets from the Vault server, resulting in permission denied errors.
After a thorough investigation and several hours of research and development, I successfully resolved the issue by reconfiguring the Kubernetes authentication for Vault using the command vault write auth/kubernetes/config.
While I was able to address the problem, it’s crucial for us to understand why this occurred and, more importantly, how we can prevent such incidents in the future. Given the critical nature of our production applications that rely on Vault, it is imperative to establish measures to avoid similar disruptions.
I would appreciate your insights and collaboration in devising a robust strategy to prevent and manage such occurrences, ensuring the stability and reliability of our Vault deployment.

vault-agent-init log:
2024-01-04T08:24:06.219Z [ERROR] agent.auth.handler: error authenticating:
error=
| Error making API request.
|
| URL: PUT http://x.x.svc:8200/v1/auth/kubernetes/login
| Code: 403. Errors:
|
| * permission denied
backoff=1s
2024-01-04T08:24:07.220Z [INFO] agent.auth.handler: authenticating

values.yaml for the Vault Helm chart:
server:
  extraEnvironmentVars:
    GOOGLE_REGION: $some_value
    GOOGLE_PROJECT: $some_value
    GOOGLE_APPLICATION_CREDENTIALS: $some_value

  extraVolumes:
    - type: 'secret'
      name: '$some_value'

  dataStorage:
    enabled: true
    # Size of the PVC created
    size: $some_value

  ha:
    enabled: true
    replicas: 3
    raft:
      enabled: true
      config: |
        ui = true

        storage "raft" {
          path = "/vault/data"
        }

        listener "tcp" {
          address      = "0.0.0.0:8200"
          cluster_addr = "0.0.0.0:8201"
          tls_disable  = "true"
        }

        seal "gcpckms" {
          credentials = "$some_value"
          project     = "$some_value"
          region      = "$some_value"
          key_ring    = "$some_value"
          crypto_key  = "$some_value"
        }

Note: I was able to fix the above error by issuing the command below.

Fix:
vault write auth/kubernetes/config \
    token_reviewer_jwt="$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)" \
    kubernetes_host="https://${KUBERNETES_PORT_443_TCP_ADDR}:443" \
    kubernetes_ca_cert=@/var/run/secrets/kubernetes.io/serviceaccount/ca.crt

Thanks,

This issue has been fixed. Please configure the auth method as stated below.

Use local service account token as the reviewer JWT

When running Vault in a Kubernetes pod the recommended option is to use the pod’s local service account token. Vault will periodically re-read the file to support short-lived tokens. To use the local token and CA certificate, omit token_reviewer_jwt and kubernetes_ca_cert when configuring the auth method. Vault will attempt to load them from token and ca.crt respectively inside the default mount folder /var/run/secrets/kubernetes.io/serviceaccount/.

vault write auth/kubernetes/config \
    kubernetes_host=https://$KUBERNETES_SERVICE_HOST:$KUBERNETES_SERVICE_PORT

Note: Requires Vault 1.9.3+. In earlier versions the service account token and CA certificate are read once and stored in Vault storage. When the service account token expires or is revoked, Vault will no longer be able to use the TokenReview API and client authentication will fail.
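
To check whether a deployment meets the version requirement above before relying on the local token, something along these lines may help. This is only a sketch: the helper names version_ge and check_kubernetes_auth are made up for illustration, it assumes a sort that supports -V (for example GNU coreutils), and it assumes VAULT_ADDR and a token with read access to auth/kubernetes/config are already set in the shell.

```shell
#!/bin/sh

# Succeeds (exit 0) when version $1 is greater than or equal to version $2.
# Assumes `sort -V` (version sort) is available.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical helper: compares the server version against 1.9.3 and then
# prints the current Kubernetes auth configuration for manual inspection.
check_kubernetes_auth() {
    # `vault status -format=json` includes the server version in a "version" field.
    server_version="$(vault status -format=json \
        | sed -n 's/.*"version": *"\([^"]*\)".*/\1/p')"

    if version_ge "$server_version" "1.9.3"; then
        echo "Vault $server_version: local service account token is re-read periodically"
    else
        echo "Vault $server_version: token and CA are read once and stored; upgrade recommended"
    fi

    # Shows kubernetes_host and related settings for the auth mount.
    vault read auth/kubernetes/config
}
```

Run check_kubernetes_auth from inside one of the Vault pods. If the output of vault read still shows a reviewer JWT as set after you have rewritten the config without token_reviewer_jwt, the old stored token is probably still in use.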