I tried to scale my k8s Vault cluster just by increasing the number of replicas in the Helm chart from 3 to 5, and I seem to have ended up with a weird setup. I have pods vault-0 through vault-4. I am able to unseal vault-0, vault-1, vault-2 and vault-4, and they all seem to be in the same cluster when I run vault operator raft list-peers. What I find weird is that vault-3 tells me the cluster is not initialized, and it is not in the raft list-peers output. I noticed the same thing with pod vault-5 when I increased the replicas to 6. Any idea why this is happening, and how do I add those pods to the same cluster?
Not enough information provided to be able to help. A good start would be:
- Vault version
- Vault configuration file
- Helm values
- Helm chart version
- Logs from failing Vault nodes
Thanks @maxb. Below is the requested information.
Vault version: docker.io/hashicorp/vault:1.10.3
HELM DETAILS
Helm chart version: vault-0.20.1
Helm values overridden:
    ha:
      enabled: true
      replicas: 5
      raft:
        # Enables Raft integrated storage
        enabled: true
        config: |
          ui = true
          listener "tcp" {
            tls_disable = 1
            address = "[::]:8200"
            cluster_address = "[::]:8201"
            # tls_cert_file = "/etc/ssl/vaultssl/vault-cert.pem"
            # tls_key_file = "/etc/ssl/vaultssl/vault-key.pem"
            # tls_disable_client_certs = "true"
            telemetry {
              unauthenticated_metrics_access = true
            }
          }
          telemetry "prometheus" {
            prometheus_retention_time = "30s"
            disable_hostname = true
          }
          storage "raft" {
            path = "/vault/data"
          }
          service_registration "kubernetes" {}
VAULT PODS
vault operator raft list-peers output from vault-1 pod:
When I try to unseal the vault-3 pod I get this:
According to the information you’ve shown, there are no instructions in the Vault config file telling Vault nodes to join an existing cluster.
Those would look something like:
    # ...
    storage "raft" {
      # ...
      retry_join {
        leader_api_addr = "http://vault-0.vault-internal:8200"
      }
      retry_join {
        leader_api_addr = "http://vault-1.vault-internal:8200"
      }
      retry_join {
        leader_api_addr = "http://vault-2.vault-internal:8200"
      }
    }
    # ...
Therefore it is expected behaviour that the new nodes don’t join.
You either need configuration similar to the above, or you need to run an explicit vault operator raft join command (consult the command's usage message) on each new node to trigger joining before its initial unseal.
I have no idea how the vault-4 node managed to join without this!
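As a rough sketch, merging the retry_join blocks into the Helm values posted above might look something like the following. The vault-N.vault-internal hostnames assume the release is named vault and uses the chart's headless vault-internal service, so adjust them if yours differ. I have left out the telemetry blocks and the commented-out TLS lines for brevity, and the note about the server: key reflects how the hashicorp/vault chart nests these values rather than anything shown in your post.

```yaml
# Sketch only: in the hashicorp/vault chart these keys normally sit under "server:".
ha:
  enabled: true
  replicas: 5
  raft:
    # Enables Raft integrated storage
    enabled: true
    config: |
      ui = true

      listener "tcp" {
        tls_disable = 1
        address = "[::]:8200"
        cluster_address = "[::]:8201"
      }

      storage "raft" {
        path = "/vault/data"

        # retry_join must be nested inside the storage "raft" block;
        # placed at the top level of the config it does not cause a join.
        retry_join {
          leader_api_addr = "http://vault-0.vault-internal:8200"
        }
        retry_join {
          leader_api_addr = "http://vault-1.vault-internal:8200"
        }
        retry_join {
          leader_api_addr = "http://vault-2.vault-internal:8200"
        }
      }

      service_registration "kubernetes" {}
```

With blocks like these in place, a newly started pod should keep trying each listed address until it finds the cluster, and it still needs to be unsealed afterwards.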
Awesome. This really helped me out. Thanks. Weird, I think I tried this before, but I had put it outside the storage "raft" section.
Thanks @maxb