Seamless cluster auto join

Hi there

I’m deploying Vault with the latest Helm chart (as of 2022-9-6) on AWS EKS, with auto-unseal through AWS KMS. I configured Raft auto-join based on Kubernetes discovery, and it works.

Except that on the initial deployment it fails because the pods are not ready: of course, vault operator init hasn’t been run yet. So I run it, and then I have to delete the additional pods to trigger a new auto-join. This is what I’d like to avoid, but I can’t find a “retry” or similar setting. Does anyone know how?
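
For reference, the manual workaround described above looks roughly like the sketch below. The release name and namespace (both "vault", hence pods vault-0 to vault-2), the KUBECTL override and the function name bootstrap_vault are assumptions for illustration, not taken from my actual setup.

```shell
#!/bin/sh
# Assumptions (hypothetical): release and namespace are both "vault",
# so the StatefulSet pods are named vault-0, vault-1 and vault-2.
NAMESPACE="${NAMESPACE:-vault}"
KUBECTL="${KUBECTL:-kubectl}"   # set KUBECTL=echo for a dry run

bootstrap_vault() {
  # Step 1: initialise Vault on the first pod. With an awskms seal this
  # prints the recovery keys and the initial root token; store them safely.
  $KUBECTL exec -n "$NAMESPACE" vault-0 -- vault operator init || return 1

  # Step 2: delete the remaining pods so the StatefulSet recreates them
  # and their retry_join runs again against the initialised node.
  $KUBECTL delete pod -n "$NAMESPACE" vault-1 vault-2
}
```

Calling bootstrap_vault once after the first helm install reproduces the two manual steps; it is exactly this second step that I would like to get rid of.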

Here’s my values-override.yml:

injector:
  enabled: false
  logLevel: info
  logFormat: json

server:
  logLevel: info
  logFormat: json
  # logLevel: debug
  # logFormat: standard

  # extraContainers:
  #   - name: debug
  #     image: alpine
  #     command: [sleep, infinity]
  # shareProcessNamespace: true

  # The default value makes the 3rd pod unschedulable on an EKS cluster with 2 worker nodes
  affinity: ""
  topologySpreadConstraints:
    - maxSkew: 1
      topologyKey: kubernetes.io/hostname
      whenUnsatisfiable: ScheduleAnyway

  standalone:
    enabled: false

  ha:
    enabled: true
    replicas: 3
    raft:
      enabled: true
      setNodeId: true
      config: |
        ui = true

        listener "tcp" {
          tls_disable = 1
          address = "[::]:8200"
          cluster_address = "[::]:8201"
        }

        storage "raft" {
          path = "/vault/data"
          retry_join {
            # https://github.com/hashicorp/go-discover
            # https://github.com/hashicorp/go-discover/blob/master/provider/k8s/k8s_discover.go#L32-L59
            auto_join        = "provider=k8s namespace={{.Release.Namespace}} label_selector=\"app.kubernetes.io/name={{ template "vault.name" . }},component=server\""
            auto_join_scheme = "http"
          }
        }

        seal "awskms" {
          region      = "eu-central-1"
          kms_key_id  = "mrk-01234567890123456789"
        }

        service_registration "kubernetes" {}

  serviceAccount:
    name: vault
    annotations:
      eks.amazonaws.com/role-arn: arn:aws:iam::01234567890:role/k8s-kms-access

Which Vault version are you running? There’s a known bug in 1.11.1 and 1.11.2.

Indeed, it’s 1.11.2, the chart’s default. I’ll try with 1.11.3.
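
A minimal sketch of how the image could be pinned to 1.11.3 without editing the values file, assuming the chart is installed from a repo alias "hashicorp" as a release named "vault" in namespace "vault" (all three are assumptions, as are the HELM override and the function name upgrade_vault). server.image.tag is the chart value that selects the Vault image tag.

```shell
#!/bin/sh
# Assumptions (hypothetical): release "vault", namespace "vault",
# and the HashiCorp chart repo added under the alias "hashicorp".
RELEASE="${RELEASE:-vault}"
NAMESPACE="${NAMESPACE:-vault}"
HELM="${HELM:-helm}"   # set HELM=echo for a dry run

upgrade_vault() {
  # Reuse the existing overrides and only change the Vault image tag.
  $HELM upgrade --install "$RELEASE" hashicorp/vault \
    --namespace "$NAMESPACE" \
    --values values-override.yml \
    --set server.image.tag=1.11.3
}
```

The same effect can be had by setting server.image.tag in values-override.yml instead of passing --set.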