SOLVED: Vault in HA with Raft - Issue joining

Hi! I’m usually not one to beg for help in these forums, but I have to admit I’m stuck and need assistance!

I am trying to set up Vault in HA mode with Raft storage and TLS, using certs from my own CA (pfSense). I created a certificate for Vault with the following info:

Subject Alternative Names: vault, vault.vault, vault.vault.svc, vault.vault.svc.cluster.local, vault-0.vault-internal, vault-1.vault-internal, vault-2.vault-internal, IP Address:127.0.0.1

I have verified that the vault.ca, vault.crt and vault.key files have been mounted inside the pods, and that the SAN names are present in those files.
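As a sanity check (a sketch, not from the original post — it assumes openssl is available wherever you run it), once you’ve copied vault.crt out of the pod with `kubectl exec … cat`, you can confirm the SANs actually made it into the certificate:

```shell
# Helper: print the Subject Alternative Names of a PEM certificate file.
show_sans() {
  openssl x509 -in "$1" -noout -ext subjectAltName
}
# e.g. after copying the cert out of the pod:
#   kubectl -n vault exec vault-0 -- cat /vault/userconfig/vault-server-tls/vault.crt > vault.crt
#   show_sans vault.crt
```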

My problem is that I can’t join a pod to raft:

❯ kubectl -n vault exec -ti vault-1 -- vault operator raft join http://vault-0.vault-internal:8200
Error joining the node to the Raft cluster: Error making API request.

URL: POST https://127.0.0.1:8200/v1/sys/storage/raft/join
Code: 500. Errors:

* failed to join raft cluster: timed out on raft join: %!w(<nil>)

I have tried tons of different IP addresses and hostnames in VAULT_ADDR, address=, cluster_addr=, and so on.

Can someone explain exactly which IPs or FQDNs should be used? I’m going blind staring at this and have run out of ideas.

This is my listener:

 ha:
    enabled: true
    replicas: 2

    apiAddr: "https://127.0.0.1:8200"

    raft:
      enabled: true
      setNodeId: true

      config: |
        ui = true

        listener "tcp" {
          tls_disable = 0
          address = "[::]:8200"
          cluster_address = "[::]:8201"
          tls_cert_file = "/vault/userconfig/vault-server-tls/vault.crt"
          tls_key_file  = "/vault/userconfig/vault-server-tls/vault.key"
          tls_client_ca_file = "/vault/userconfig/vault-server-tls/vault.ca"             
        }

        disable_mlock = true

        storage "raft" {
          path = "/vault/data"
        }

        service_registration "kubernetes" {}

and these are my extra environment variables:

  extraEnvironmentVars:
    VAULT_CACERT: "/vault/userconfig/vault-server-tls/vault.ca"
    VAULT_ADDR: "https://127.0.0.1:8200"

This is output of my Vault Server configuration values:

==> Vault server configuration:

             Api Address: https://127.0.0.1:8200
                     Cgo: disabled
         Cluster Address: https://vault-0.vault-internal:8201
              Go Version: go1.17.7
              Listener 1: tcp (addr: "[::]:8200", cluster address: "[::]:8201", max_request_duration: "1m30s", max_request_size: "33554432", tls: "enabled")
               Log Level: trace
                   Mlock: supported: true, enabled: false
           Recovery Mode: false
                 Storage: raft (HA available)
                 Version: Vault v1.10.0
             Version Sha: 7738ec5d0d6f5bf94a809ee0f6ff0142cfa525a6

Any ideas what could be wrong? I hope someone can spell out exactly which IPs I should use.

A good first step would be to exec into a pod and confirm whether you can actually connect to the URL you’re passing here.

Based on the rest of your post, it seems you’re using https for your port 8200 listener, so the fact you have http: here is certainly a problem. There may be more beyond that.

Raft is a consensus-based system. Having 2 nodes is little better than having 1: if either fails, the system can no longer achieve consensus. Three is the minimum number of replicas that gains any failure tolerance in a quorum-based system.
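The quorum arithmetic behind that can be spelled out (a quick illustration, not Vault-specific):

```shell
# Raft quorum for an n-node cluster is floor(n/2) + 1;
# the cluster survives n - quorum simultaneous failures.
quorum() { echo $(( $1 / 2 + 1 )); }
for n in 1 2 3 5; do
  q=$(quorum "$n")
  echo "nodes=$n quorum=$q tolerated_failures=$(( n - q ))"
done
# nodes=2 tolerates 0 failures -- no better than a single node.
```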

Thank you for your reply.

The http: was from a command I had copied from the wrong terminal window; I have actually been using https: for the join command.

I exec’ed into vault-1 and tried the following:

/ $ wget https://vault-0.vault-internal:8200
Connecting to vault-0.vault-internal:8200 (192.168.3.3:8200)
ssl_client: vault-0.vault-internal: certificate verification failed: unable to get local issuer certificate
wget: error getting response: Connection reset by peer

and did a log follow on vault-0:

2022-04-22T08:53:50.141Z [INFO]  http: TLS handshake error from 192.168.2.3:43002: remote error: tls: bad certificate
2022-04-22T08:55:13.195Z [INFO]  http: TLS handshake error from 192.168.3.3:56976: local error: tls: bad record MAC

As I mentioned earlier, my certificates are located in both pods:

❯ kubectl -n vault exec -ti vault-0 -- ls /vault/userconfig/vault-server-tls/
vault.ca   vault.crt  vault.key

❯ kubectl -n vault exec -ti vault-1 -- ls /vault/userconfig/vault-server-tls/
vault.ca   vault.crt  vault.key

The vault.ca file contains my intermediate CA only, while vault.crt is a chained certificate with my Vault certificate in the top section and my intermediate certificate at the bottom.
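Given that layout, one way to check the chain (a sketch, not from the original post; `-partial_chain` lets verification stop at the intermediate instead of requiring the root) is to copy both files out of the pod and run:

```shell
# Helper: check that a leaf/chain cert verifies against a given CA bundle.
# -partial_chain allows trust to terminate at an intermediate CA.
verify_chain() {
  openssl verify -partial_chain -CAfile "$1" "$2"
}
# e.g. with local copies of the mounted files:
#   verify_chain vault.ca vault.crt
```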

I’ll use 3 nodes when I get this thing up and running correctly.

But I think there’s either a cert issue or I’m using the wrong URL/IP/FQDN somewhere.

This is a decode of my vault.crt file using the sslshopper.com decoder:

If anyone has the time and feels like digging a bit deeper into this, here’s my Helm values file. There are some leftovers from testing Ingress and other stuff, but it’s left in there intentionally just in case I want to enable it at some point.

global:
  enabled: true
  imagePullSecrets: []
  tlsDisable: false
  openshift: false
  psp:
    enable: false
    annotations: |
      seccomp.security.alpha.kubernetes.io/allowedProfileNames: docker/default,runtime/default
      apparmor.security.beta.kubernetes.io/allowedProfileNames: runtime/default
      seccomp.security.alpha.kubernetes.io/defaultProfileName:  runtime/default
      apparmor.security.beta.kubernetes.io/defaultProfileName:  runtime/default

injector:
  enabled: false
  replicas: 1

  port: 8080
  leaderElector:
    enabled: true

  metrics:
    enabled: false

  externalVaultAddr: ""

  image:
    repository: "hashicorp/vault-k8s"
    tag: "0.14.2"
    pullPolicy: IfNotPresent

  agentImage:
    repository: "hashicorp/vault"
    tag: "1.9.3"

  agentDefaults:
    cpuLimit: "500m"
    cpuRequest: "250m"
    memLimit: "128Mi"
    memRequest: "64Mi"
    template: "map"
    templateConfig:
      exitOnRetryFailure: true
      staticSecretRenderInterval: ""

  authPath: "auth/kubernetes"
  logLevel: "trace"
  logFormat: "standard"
  revokeOnShutdown: false
  
  webhook: 
    failurePolicy: Ignore
    matchPolicy: Exact
    timeoutSeconds: 30
    namespaceSelector: {}
    objectSelector: {}
    annotations: {}

  failurePolicy: Ignore
  namespaceSelector: {}
  objectSelector: {}
  webhookAnnotations: {}

  certs:
    secretName: "vault.hko.lab-tls"
    caBundle: ""
    certName: tls.crt
    keyName: tls.key

  resources: {}
  extraEnvironmentVars: {}

  affinity: |
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchLabels:
              app.kubernetes.io/name: {{ template "vault.name" . }}-agent-injector
              app.kubernetes.io/instance: "{{ .Release.Name }}"
              component: webhook
          topologyKey: kubernetes.io/hostname

  tolerations: |
    - key: 'node-role.kubernetes.io/master'
      operator: "Exists"
      effect: NoSchedule

  nodeSelector: {}
  priorityClassName: ""
  annotations: {}
  extraLabels: {}

  
  
  hostNetwork: false

  service:
    annotations: {}

  podDisruptionBudget: {}
  strategy: {}

server:
  enabled: true
  enterpriseLicense:
    secretName: ""
    secretKey: "license"

  image:
    repository: "hashicorp/vault"
    tag: "latest"
    pullPolicy: IfNotPresent

  updateStrategyType: "OnDelete"

  logLevel: "trace"
  logFormat: ""
  resources: {}

  ingress:
    enabled: false
    labels: {}
      
    annotations: {}
    ingressClassName: "nginx"
    pathType: Prefix

    activeService: false
    hosts:
      - host: "vault.hko.lab"
        paths: []

    extraPaths: []

    tls:
      - secretName: vault.hko.lab-tls
        hosts:

          - vault.hko.lab
          - vault
          - vault.local
          - vault-internal
          - vault.svc.cluster.local
          - vault.vault-internal
          - vault.vault-internal.vault.svc
          - vault.vault-internal.vault.svc.cluster.local
          - vault-0.vault-internal
          - vault-1.vault-internal
          - vault-2.vault-internal
          - vault-0.vault-internal.vault
          - vault-1.vault-internal.vault
          - vault-2.vault-internal.vault
          - vault-0.vault-internal.vault.svc
          - vault-1.vault-internal.vault.svc
          - vault-2.vault-internal.vault.svc
          - vault-0.vault-internal.vault.svc.cluster.local
          - vault-1.vault-internal.vault.svc.cluster.local
          - vault-2.vault-internal.vault.svc.cluster.local

  route:
    enabled: false
    activeService: true

    labels: {}
    annotations: {}
      
    host: vault.hko.lab

    tls:
      termination: passthrough

  authDelegator:
    enabled: true

  extraInitContainers: null
  extraContainers: null
  shareProcessNamespace: false
  extraArgs: ""

  readinessProbe:
    enabled: true
    failureThreshold: 2
    initialDelaySeconds: 5
    periodSeconds: 5
    successThreshold: 1
    timeoutSeconds: 3
  
  livenessProbe:
    enabled: false
    path: "/v1/sys/health?standbyok=true"
    failureThreshold: 2
    initialDelaySeconds: 60
    periodSeconds: 5
    successThreshold: 1
    timeoutSeconds: 3

  terminationGracePeriodSeconds: 10
  preStopSleepSeconds: 5

  postStart: []
  
  extraEnvironmentVars:
    VAULT_CACERT: "/vault/userconfig/vault-server-tls/vault.ca"
    VAULT_ADDR: "https://127.0.0.1:8200"

  extraSecretEnvironmentVars: []
  extraVolumes: []

  volumes:
  - name: vault-server-tls
    secret:
      secretName: "vault.hko.lab-tls"

  volumeMounts:
    - name: vault-server-tls
      mountPath: /vault/userconfig/vault-server-tls/vault.crt
      subPath: vault.crt
      readOnly: true
    - name: vault-server-tls
      mountPath: /vault/userconfig/vault-server-tls/vault.key
      subPath: vault.key
      readOnly: true
    - name: vault-server-tls
      mountPath: /vault/userconfig/vault-server-tls/vault.ca
      subPath: vault.ca
      readOnly: true

  affinity: |
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchLabels:
              app.kubernetes.io/name: {{ template "vault.name" . }}
              app.kubernetes.io/instance: "{{ .Release.Name }}"
              component: server
          topologyKey: kubernetes.io/hostname

  tolerations: 
  nodeSelector: {}

  networkPolicy:
    enabled: false
    egress: []

  priorityClassName: ""

  extraLabels: {}

  annotations: {}

  service:
    enabled: true
    type: ClusterIP
    publishNotReadyAddresses: true
    externalTrafficPolicy: Cluster
    port: 8200
    targetPort: 8200
    annotations: {}

  dataStorage:
    enabled: true
    size: 10Gi
    mountPath: "/vault/data"
    storageClass: kubernetes-gs01-storage-policy
    accessMode: ReadWriteOnce
    annotations: {}

  auditStorage:
    enabled: false
    size: 10Gi
    mountPath: "/vault/audit"
    storageClass: kubernetes-gs01-storage-policy
    accessMode: ReadWriteOnce
    annotations: {}

  dev:
    enabled: false
    devRootToken: "root"

  standalone:
    enabled: false

    config: |
      ui = true

      listener "tcp" {
        tls_disable = 0
        address = "[::]:8200"
        cluster_address = "[::]:8200"
        api_addr = "[::]:8200"
        tls_cert_file = "/vault/userconfig/vault-server-tls/vault.crt"
        tls_key_file  = "/vault/userconfig/vault-server-tls/vault.key"
        tls_client_ca_file = "/vault/userconfig/vault-server-tls/vault.ca"   
      }
      storage "file" {
        path = "/vault/data"
      }

  ha:
    enabled: true
    replicas: 3
    apiAddr: "https://127.0.0.1:8200"

    raft:
      enabled: true
      setNodeId: true

      config: |
        ui = true

        listener "tcp" {
          tls_disable = 0
          address = "[::]:8200"
          cluster_address = "[::]:8201"
          tls_cert_file = "/vault/userconfig/vault-server-tls/vault.crt"
          tls_key_file  = "/vault/userconfig/vault-server-tls/vault.key"
          tls_client_ca_file = "/vault/userconfig/vault-server-tls/vault.ca"             
        }

        disable_mlock = true
        log_level = "Trace"

        storage "raft" {
          path = "/vault/data"
        }

        service_registration "kubernetes" {}

    config: |
      ui = true
      api_addr = "[::]:8200"

      listener "tcp" {
        tls_disable = 0
        address = "[::]:8200"
        cluster_address = "[::]:8201"
        tls_cert_file = "/vault/userconfig/vault-server-tls/vault.crt"
        tls_key_file  = "/vault/userconfig/vault-server-tls/vault.key"
        tls_client_ca_file = "/vault/userconfig/vault-server-tls/vault.ca"   
      }
      storage "raft" {
        path = "/vault/data"
        
      }

      service_registration "kubernetes" {}
    
    disruptionBudget:
      enabled: true
      maxUnavailable: null

  serviceAccount:
    create: true
    name: ""
    annotations: {}

  statefulSet:
    annotations: {}

ui:
  enabled: true
  publishNotReadyAddresses: true
  activeVaultPodOnly: false
  serviceType: "ClusterIP"
  serviceNodePort: null
  externalPort: 8200
  targetPort: 8200
  externalTrafficPolicy: Cluster
  annotations: {}

csi:
  enabled: false

  image:
    repository: "hashicorp/vault-csi-provider"
    tag: "1.0.0"
    pullPolicy: IfNotPresent

  volumes: null
  volumeMounts: null
  resources: {}

  daemonSet:
    updateStrategy:
      type: RollingUpdate
      maxUnavailable: ""

    annotations: {}
    providersDir: "/etc/kubernetes/secrets-store-csi-providers"
    kubeletRootDir: "/var/lib/kubelet"
    extraLabels: {}

  pod:
    annotations: {}
    tolerations: []
    extraLabels: {}
  priorityClassName: ""

  serviceAccount:
    annotations: {}
    extraLabels: {}

  readinessProbe:
    failureThreshold: 2
    initialDelaySeconds: 5
    periodSeconds: 5
    successThreshold: 1
    timeoutSeconds: 3
  
  livenessProbe:
    failureThreshold: 2
    initialDelaySeconds: 5
    periodSeconds: 5
    successThreshold: 1
    timeoutSeconds: 3

  debug: false

  extraArgs: []

I’ve never actually used the Helm chart myself, so I don’t have any ready comments about that, but it strikes me that the problem might be that your custom CA isn’t trusted by the part of the Vault server invoked by vault operator raft join.

You should look into getting your custom CA trusted by Go’s default system CA loading on Linux. I don’t know exactly how best to do this on K8s, but these files show the locations and algorithms Go uses:

You should look at the logs not just of the target node of the join (vault-0), but also of the node you’re initiating the join from (vault-1).

And, lastly, you might consider just turning off TLS entirely in your experimental environment, to remove one source of complexity from the problem.

I disabled TLS and I was able to join vault-1 to the raft cluster…

Actually, I just noticed that the documentation in “Highly Available Vault Cluster with Raft” on the HashiCorp site is probably wrong.

Here they say you should first initialise and unseal vault-0, and then run the join command for vault-1 followed by unseal.

I had to do the opposite for vault-1: first unseal it, then run the join command.

This part:

kubectl exec -ti vault-1 -- vault operator raft join http://vault-0.vault-internal:8200
kubectl exec -ti vault-1 -- vault operator unseal

…had to be done like this:

kubectl exec -ti vault-1 -- vault operator unseal
kubectl exec -ti vault-1 -- vault operator raft join http://vault-0.vault-internal:8200

So yes, maybe you’re right that my CA isn’t trusted… Strange that I can’t find anyone else struggling with this, though… I imagine more people are using private CAs, at least for testing.

There must be something else at work here then, as you can’t unseal a node that doesn’t yet have an encrypted master key to decrypt, and it won’t have one until it has replicated it following the join - i.e. the docs have the join and unseal in the intended order.


Then I’m completely lost. I’ll abandon the Helm chart for setting up Vault for the time being. :slight_smile: Thank you for helping me out! Appreciate it.

Typical… a day after posting this I managed to get it working…

(The complete values file is at the bottom of this post)

The following commands use the -leader-ca-cert= and -ca-cert= flags to specify the correct certificates. Note the “@” in the path of the -leader-ca-cert parameter.

❯ helm install -f my_values.yaml vault hashicorp/vault --namespace vault

# Open a new terminal window and follow the logs of vault-0 pod:
❯ kubectl -n vault logs --follow vault-0

❯ kubectl -n vault exec -ti vault-0 -- vault operator init -key-shares=1 -key-threshold=1
Unseal Key 1: REDACTED

Initial Root Token: REDACTED

❯ kubectl -n vault exec --stdin=true --tty=true vault-0 -- vault operator unseal
Unseal Key (will be hidden):
Key                     Value
---                     -----
Seal Type               shamir
Initialized             true
Sealed                  false

# If you get an error on the following command please retry after 10-30 seconds:
❯ kubectl -n vault exec -ti vault-1  -- vault operator raft join -leader-ca-cert=@/vault/userconfig/vault-server-tls/vault.ca https://vault-0.vault-internal:8200
Key       Value
---       -----
Joined    true

❯ kubectl -n vault exec --stdin=true --tty=true vault-1 -- vault operator unseal
Unseal Key (will be hidden):

❯ kubectl -n vault exec -ti vault-1 -- vault status
Key                     Value
---                     -----
Seal Type               shamir
Initialized             true
Sealed                  false

❯ kubectl -n vault exec -ti vault-0 -- vault login
Token (will be hidden):
Success! You are now authenticated.

❯ kubectl -n vault exec -ti vault-0 -- vault operator raft list-peers -ca-cert=/vault/userconfig/vault-server-tls/vault.ca
Node       Address                        State       Voter
----       -------                        -----       -----
vault-0    vault-0.vault-internal:8201    leader      true
vault-1    vault-1.vault-internal:8201    follower    true
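With three replicas, vault-2 needs the same join-then-unseal treatment. The per-pod steps can be wrapped up like this (a sketch, not from the original post, assuming the same cert paths and a single-share unseal key):

```shell
# Join one follower pod to the Raft cluster, then unseal it.
# Usage: join_and_unseal <pod-index> <unseal-key>
join_and_unseal() {
  kubectl -n vault exec "vault-$1" -- vault operator raft join \
    -leader-ca-cert=@/vault/userconfig/vault-server-tls/vault.ca \
    https://vault-0.vault-internal:8200
  kubectl -n vault exec "vault-$1" -- vault operator unseal "$2"
}
# e.g.: join_and_unseal 2 "$UNSEAL_KEY"
```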

Working Helm chart:

---
global:
  enabled: true
  imagePullSecrets: []
  tlsDisable: false
  openshift: false
  psp:
    enable: false
    annotations: |
      seccomp.security.alpha.kubernetes.io/allowedProfileNames: docker/default,runtime/default
      apparmor.security.beta.kubernetes.io/allowedProfileNames: runtime/default
      seccomp.security.alpha.kubernetes.io/defaultProfileName:  runtime/default
      apparmor.security.beta.kubernetes.io/defaultProfileName:  runtime/default
injector:
  enabled: false
  replicas: 1
  port: 8080
  leaderElector:
    enabled: true
  metrics:
    enabled: false
  externalVaultAddr: ''
  image:
    repository: hashicorp/vault-k8s
    tag: 0.14.2
    pullPolicy: IfNotPresent
  agentImage:
    repository: hashicorp/vault
    tag: 1.9.3
  agentDefaults:
    cpuLimit: 500m
    cpuRequest: 250m
    memLimit: 128Mi
    memRequest: 64Mi
    template: map
    templateConfig:
      exitOnRetryFailure: true
      staticSecretRenderInterval: ''
  authPath: auth/kubernetes
  logLevel: info
  logFormat: standard
  revokeOnShutdown: false
  webhook:
    failurePolicy: Ignore
    matchPolicy: Exact
    timeoutSeconds: 30
    namespaceSelector: {}
    objectSelector: {}
    annotations: {}
  failurePolicy: Ignore
  namespaceSelector: {}
  objectSelector: {}
  webhookAnnotations: {}
  certs:
    secretName: null
    caBundle: ''
    certName: tls.crt
    keyName: tls.key
  resources: {}
  extraEnvironmentVars: {}
  affinity: |
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchLabels:
              app.kubernetes.io/name: {{ template "vault.name" . }}-agent-injector
              app.kubernetes.io/instance: "{{ .Release.Name }}"
              component: webhook
          topologyKey: kubernetes.io/hostname
  tolerations: []
  nodeSelector: {}
  priorityClassName: ''
  annotations: {}
  extraLabels: {}
  hostNetwork: false
  service:
    annotations: {}
  podDisruptionBudget: {}
  strategy: {}
server:
  enabled: true
  enterpriseLicense:
    secretName: ''
    secretKey: license
  image:
    repository: hashicorp/vault
    tag: 1.9.3
    pullPolicy: IfNotPresent
  updateStrategyType: OnDelete
  logLevel: trace
  logFormat: ''
  resources: {}
  ingress:
    enabled: false
    labels: {}
    annotations: {}
    ingressClassName: ''
    pathType: Prefix
    activeService: true
    hosts:
      - host: chart-example.local
        paths: []
    extraPaths: []
    tls: []
  route:
    enabled: false
    activeService: true
    labels: {}
    annotations: {}
    host: chart-example.local
    tls:
      termination: passthrough
  authDelegator:
    enabled: true
  extraInitContainers: null
  extraContainers: null
  shareProcessNamespace: false
  extraArgs: ''
  readinessProbe:
    enabled: true
    failureThreshold: 2
    initialDelaySeconds: 5
    periodSeconds: 5
    successThreshold: 1
    timeoutSeconds: 3
  livenessProbe:
    enabled: false
    path: /v1/sys/health?standbyok=true
    failureThreshold: 2
    initialDelaySeconds: 60
    periodSeconds: 5
    successThreshold: 1
    timeoutSeconds: 3
  terminationGracePeriodSeconds: 10
  preStopSleepSeconds: 5
  postStart: []
  extraEnvironmentVars:
    VAULT_ADDR: https://127.0.0.1:8200
    VAULT_CACERT: /vault/userconfig/vault-server-tls/vault.ca
    # VAULT_CAPATH: /vault/userconfig/vault-server-tls
    # VAULT_CLIENT_CERT: /vault/userconfig/vault-server-tls/vault.crt
    # VAULT_CLIENT_KEY: /vault/userconfig/vault-server-tls/vault.key
  extraSecretEnvironmentVars: []
  extraVolumes: []
  volumes:
    - name: vault-server-tls
      secret:
        secretName: vault.hko.lab-tls
  volumeMounts:
    - name: vault-server-tls
      mountPath: /vault/userconfig/vault-server-tls/vault.crt
      subPath: vault.crt
      readOnly: true
    - name: vault-server-tls
      mountPath: /vault/userconfig/vault-server-tls/vault.key
      subPath: vault.key
      readOnly: true
    - name: vault-server-tls
      mountPath: /vault/userconfig/vault-server-tls/vault.ca
      subPath: vault.ca
      readOnly: true
  affinity: |
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchLabels:
              app.kubernetes.io/name: {{ template "vault.name" . }}
              app.kubernetes.io/instance: "{{ .Release.Name }}"
              component: server
          topologyKey: kubernetes.io/hostname
  tolerations: []
  nodeSelector: {}
  networkPolicy:
    enabled: false
    egress: []
  priorityClassName: ''
  extraLabels: {}
  annotations: {}
  service:
    enabled: true
    publishNotReadyAddresses: true
    externalTrafficPolicy: Cluster
    port: 8200
    targetPort: 8200
    annotations: {}
  dataStorage:
    enabled: true
    size: 10Gi
    mountPath: /vault/data
    storageClass: kubernetes-gs01-storage-policy
    accessMode: ReadWriteOnce
    annotations: {}
  auditStorage:
    enabled: false
    size: 10Gi
    mountPath: /vault/audit
    storageClass: null
    accessMode: ReadWriteOnce
    annotations: {}
  dev:
    enabled: false
    devRootToken: root
  standalone:
    enabled: false
    config: |
      ui = true

      listener "tcp" {
        tls_disable = 0
        address = "[::]:8200"
        cluster_address = "[::]:8201"
      }
      storage "file" {
        path = "/vault/data"
      }
  ha:
    enabled: true
    replicas: 3
    apiAddr: null
    clusterAddr: null
    raft:
      enabled: true
      setNodeId: true
      config: |
        ui = true

        listener "tcp" {
          tls_disable = 0
          address = "[::]:8200"
          cluster_address = "[::]:8201"
          tls_cert_file = "/vault/userconfig/vault-server-tls/vault.crt"
          tls_key_file  = "/vault/userconfig/vault-server-tls/vault.key"
          tls_client_ca_file = "/vault/userconfig/vault-server-tls/vault.ca"           
        }

        storage "raft" {
          path = "/vault/data"
        }

        service_registration "kubernetes" {}
    config: |
      ui = true

      listener "tcp" {
        tls_disable = 1
        address = "[::]:8200"
        cluster_address = "[::]:8201"
      }
      storage "consul" {
        path = "vault"
        address = "HOST_IP:8500"
      }

      service_registration "kubernetes" {}
    disruptionBudget:
      enabled: true
      maxUnavailable: null
  serviceAccount:
    create: true
    name: ''
    annotations: {}
  statefulSet:
    annotations: {}
ui:
  enabled: true
  publishNotReadyAddresses: true
  activeVaultPodOnly: false
  serviceType: LoadBalancer
  serviceNodePort: null
  externalPort: 8200
  targetPort: 8200
  externalTrafficPolicy: Cluster
  annotations: {}
csi:
  enabled: false
  image:
    repository: hashicorp/vault-csi-provider
    tag: 1.0.0
    pullPolicy: IfNotPresent
  volumes: null
  volumeMounts: null
  resources: {}
  daemonSet:
    updateStrategy:
      type: RollingUpdate
      maxUnavailable: ''
    annotations: {}
    providersDir: /etc/kubernetes/secrets-store-csi-providers
    kubeletRootDir: /var/lib/kubelet
    extraLabels: {}
  pod:
    annotations: {}
    tolerations: []
    extraLabels: {}
  priorityClassName: ''
  serviceAccount:
    annotations: {}
    extraLabels: {}
  readinessProbe:
    failureThreshold: 2
    initialDelaySeconds: 5
    periodSeconds: 5
    successThreshold: 1
    timeoutSeconds: 3
  livenessProbe:
    failureThreshold: 2
    initialDelaySeconds: 5
    periodSeconds: 5
    successThreshold: 1
    timeoutSeconds: 3
  debug: false
  extraArgs: []

How did you manage to make it work? We face the same exact issue, please advise.

Did you remember the "@" in front of the vault.ca, like you see below?

Yes, not working.

For internal pod communication we use self-signed certs.

/vault/userconfig/vault-internal $ vault operator raft join -leader-ca-cert=@ca.crt https://vault-test-0.vault-test-internal:8200
Error joining the node to the Raft cluster: Error making API request.

URL: POST https://127.0.0.1:8200/v1/sys/storage/raft/join
Code: 500. Errors:

* failed to join raft cluster: failed to get raft challenge

If I add -tls-skip-verify it works, but the pod is not listed as a peer. We also use awskms for auto-unseal, so we cannot use vault operator unseal; when I restart the pod it still doesn’t join.

I tried the below as well; it’s not working either:

/vault/userconfig/vault-internal $ vault operator raft join -leader-ca-cert=@/vault/userconfig/vault-internal/ca.crt -address=https://xcr-vault-test-0.xcr-vault-test-internal:8200

Error joining the node to the Raft cluster: Post "https://xcr-vault-test-0.xcr-vault-test-internal:8200/v1/sys/storage/raft/join": x509: certificate is not valid for any names, but wanted to match xcr-vault-test-0.xcr-vault-test-internal
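One way to avoid running the join by hand at all is a retry_join stanza inside the raft storage block, so each pod keeps retrying the join with the right CA on startup. A sketch, using the hostnames and paths from this thread:

```hcl
storage "raft" {
  path = "/vault/data"

  retry_join {
    leader_api_addr     = "https://xcr-vault-test-0.xcr-vault-test-internal:8200"
    leader_ca_cert_file = "/vault/userconfig/vault-internal/ca.crt"
  }
}
```

Note though that the x509 error above says the serving certificate matches no names at all, so that cert will need the pod hostnames added as SANs before any join method can verify it.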

Massive thank you for this. The command to include the ca certificate while executing a join command saved my bacon today!

Certs are provided via Helm values, although I plan to change this to volumes, as extraVolumes is deprecated:

extraVolumes:
  - type: secret
    name: vault-tls-secret

My join command:
vault operator raft join -leader-ca-cert=@/vault/userconfig/vault-tls-secret/ca.crt https://vault-0.vault-internal:8200