A bit confused on how to set `service_registration` in HA mode (Helm chart)

Greetings,

I have semi-successfully deployed Vault, but I was surprised to find that I had to run the vault operator init command on all of the replica pods, rather than just one, in order to get them to the Running state. My question below might be related to this; or perhaps, rather than exec-ing the init command directly on the pods, I should have done it via the Vault CLI through the service?

Anyway,
The Helm chart documentation states:

If ha is enabled the Ingress will point to the active vault server via the active Service. This requires vault 1.4+ and service_registration to be set in the vault config.

I have server.ha.enabled set to true, and have service_registration "kubernetes" {} in the server configuration.

What I do not understand is that, according to the documentation, a pod name needs to be set via pod_name. But wouldn’t that pin the service to just that one pod?

In an HA deployment, how can I get load-balancing behaviour without setting anything statically?

Don’t do this - you’ve just accidentally created multiple standalone single-node clusters that won’t share data with each other.

What documentation says that?

Yep, running init on multiple replicas never felt quite right…
So I’m guessing what I should’ve done is to run it on one of the replicas, and delete the other replica pods so that they come back up, hopefully into a running state?

It doesn’t really say that exactly; it’s more a case of it not saying enough 🙂
It’s the bit I quoted from the documentation above. In that case, is an empty service_registration config fine, then? Because the other alternative would be to set a pod_name, and that would pin it to one pod, right?

OK, let’s break this down… there are too many layers of things unknown or going wrong here.

First, you need to say which storage backend your Vault is configured to use, because that has a huge effect on how the HA works - Vault HA is mediated through the storage backend.

Second, I wonder whether you’re mixing up ‘running’ and ‘ready’, which mean different things in Kubernetes. Vault pods should progress to ‘running’ automatically. If they don’t, something is wrong, and vault operator init would not fix it. However, the pods don’t become ‘ready’ until they are unsealed, which can only happen after the cluster is initialized.

Third, once all of that is out of the way, you seem to be tying yourself in unnecessary knots questioning the default service_registration "kubernetes" {} configuration - just leave it as is; the relevant environment variables to go with it are already provided by the Helm chart.
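
To illustrate - just a sketch of the stanza, not something extra you need to add - the empty block is enough because both of its optional fields fall back to environment variables (VAULT_K8S_NAMESPACE and VAULT_K8S_POD_NAME, if I remember the names right) that the chart already injects into every server pod via the downward API:

service_registration "kubernetes" {
  # Both settings are optional and fall back to env vars the Helm chart
  # provides, so each pod registers itself - nothing gets pinned to a
  # single pod.
  # namespace = "vault"     # falls back to VAULT_K8S_NAMESPACE
  # pod_name  = "vault-0"   # falls back to VAULT_K8S_POD_NAME
}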

Raft. I’ll provide the .tf snippet with all the value overrides below.

Hmm no, not in my case; the get pod output showed 0/1 until I issued the init command. Although I did indeed mix up Ready with Running, sorry about that.
So initially, right after the deployment:

$ kubectl -n vault logs vault-0

...
2023-04-12T06:29:50.945Z [INFO]  core: security barrier not initialized
2023-04-12T06:29:50.945Z [INFO]  core.autoseal: seal configuration missing, but cannot check old path as core is sealed: seal_type=recovery
2023-04-12T06:29:51.878Z [INFO]  core: stored unseal keys supported, attempting fetch
2023-04-12T06:29:51.878Z [WARN]  failed to unseal core: error="stored unseal keys are supported, but none were found"

Until I ran kubectl -n vault exec -it vault-0 -- vault operator init. Only then:

vault-0                          1/1     Running   0          10m

So why then, in three places in the Helm configuration documentation I linked above, does it say ‘This requires vault 1.4+ and service_registration to be set in the vault config’? This is the confusing bit, and it’s the third time I’m trying to clarify this specific point in this thread; I am pretty sure ‘just leave it at the default’ does not mean the same thing as ‘this requires service_registration to be set in the vault config’.

BTW, I’m on v0.24.0 of the chart, the latest as of now.

My vault server configuration TF template:

ui = ${enable_ui}

listener "tcp" {
  tls_disable     = ${listener_tls_disable}
  address         = "[::]:8200"
  cluster_address = "[::]:8201"
  # Enable unauthenticated metrics access (necessary for Prometheus Operator)
  #telemetry {
  #  unauthenticated_metrics_access = "true"
  #}
}

storage "raft" {
  path = "/vault/data"
}

seal "awskms" {
  region     = "${region_name}"
  kms_key_id = "${unseal_key_id}"
}

service_registration "kubernetes" {}

Perhaps best if I removed that service_registration "kubernetes" {} bit at the bottom, then?

And below is the complete Helm chart configuration I’m using, in the form of TF code:

esource "helm_release" "vault" {
  name       = "vault"
  repository = "https://helm.releases.hashicorp.com"
  version    = var.chart_versions["vault"]
  namespace  = var.deployment_namespace
  chart      = "vault"

  # Chart setting "csi.enabled" establishes a dependency on the secrets-store-csi-driver
  depends_on = [
    helm_release.secrets-store-csi-driver
  ]

  # Disabled since we'll have the secrets-store-csi-driver doing the secrets injection
  set {
    name  = "injector.enabled"
    value = false
  }

  set {
    name  = "server.updateStrategyType"
    value = "RollingUpdate"
  }

  set {
    name  = "server.logLevel"
    value = "info"
  }

  set {
    name  = "server.resources.requests.cpu"
    value = "0.5"
  }
  set {
    name  = "server.resources.limits.cpu"
    value = "0.5"
  }
  set {
    name  = "server.resources.requests.memory"
    value = "1Gi"
  }
  set {
    name  = "server.resources.limits.memory"
    value = "1Gi"
  }


  # We want to make Vault an externally accessible service,
  # through Traefik just like any other service.
  set {
    name  = "server.service.type"
    value = "NodePort" # default: ClusterIP
  }

  set {
    name  = "server.standalone.enabled"
    value = false
  }
  set {
    name  = "server.ha.enabled"
    value = true
  }
  set {
    name  = "server.ha.replicas"
    value = 3
  }
  set {
    name  = "server.ha.raft.enabled"
    value = true
  }
  set {
    name  = "server.datastorage.size"
    value = "8Gi"
  }

  set {
    name  = "server.disruptionBudget.maxUnavailable"
    value = 1
  }

  set {
    name  = "server.auditStorage.enabled"
    value = true
  }
  set {
    name  = "server.auditStorage.size"
    value = "16Gi"
  }

  set {
    name  = "ui.enabled"
    value = true
  }
  set {
    name  = "ui.serviceType"
    value = "NodePort"
  }

  set {
    name  = "server.ha.raft.config"
    value = local.vault_server_config
  }

  set {
    name  = "server.serviceAccount.annotations"
    value = <<EOS
      "eks.amazonaws.com/role-arn": "${aws_iam_role.vault-server.arn}"
    EOS
  }

  set {
    name  = "csi.enabled"
    value = true
  }
}
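
For reference, local.vault_server_config is the rendered form of the server config template shown earlier; something along the lines of the following (the file path and the KMS/region references here are only illustrative):

locals {
  # Illustrative sketch: render the Vault server config template,
  # filling in its ${...} placeholders.
  vault_server_config = templatefile("${path.module}/templates/vault-config.hcl.tpl", {
    enable_ui            = true
    listener_tls_disable = true
    region_name          = var.region
    unseal_key_id        = aws_kms_key.vault_unseal.key_id
  })
}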

Thanks for looking into it!

OK, so you’re using Raft. You’ve successfully initialized the cluster on one node. Now you need to join the other nodes to the cluster.

Vault HA cluster with integrated storage | Vault | HashiCorp Developer describes using the vault operator raft join command to manually provide the API URL of an existing node to a new node.

The next section, Vault HA cluster with integrated storage | Vault | HashiCorp Developer, describes the alternative - listing sufficient API URLs of other nodes in retry_join blocks within the configuration file, so the nodes can find each other without requiring manual vault operator raft join commands. (Note: these retry_join blocks can be left in the configuration afterwards - they are only used to guide initial cluster formation.)

Once you have chosen and performed one of these methods, you should have all of your pods achieving ‘ready’.

Most users of the Vault Helm chart probably use a configuration looking somewhat similar to

        retry_join {
          leader_api_addr = "http://vault-0.vault-internal:8200"
        }
        retry_join {
          leader_api_addr = "http://vault-1.vault-internal:8200"
        }
        retry_join {
          leader_api_addr = "http://vault-2.vault-internal:8200"
        }
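
For completeness, those retry_join blocks sit inside the storage "raft" stanza. Assuming the chart release is named vault (so the headless service is vault-internal) and the listener keeps TLS disabled, as in the addresses above, the stanza from your template would end up roughly like this:

storage "raft" {
  path = "/vault/data"

  retry_join {
    leader_api_addr = "http://vault-0.vault-internal:8200"
  }
  retry_join {
    leader_api_addr = "http://vault-1.vault-internal:8200"
  }
  retry_join {
    leader_api_addr = "http://vault-2.vault-internal:8200"
  }
}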

Presumably this is just a mild warning not to remove the service_registration "kubernetes" {} stanza that is present in the default Vault server configuration shipped with the Helm chart. You haven’t removed it - it’s still in the config you pasted - good.

Um, no. Quite the opposite.

Thanks! Can confirm it now works with the retry_join configuration added; then I only had to perform the init on one of the replicas. Logs are looking good.

This makes the cluster go into standby mode. I want the cluster in active mode. Can someone please advise on this?