I built a minimal Docker image to run Vault (for learning), and I can launch the Vault server successfully. I have created a 3-replica StatefulSet, with each replica running the config below. However, I persistently run into a bad certificate error when joining vault-1 to vault-0 (where Vault was initialized).
Any tips will be appreciated!
Error from vault-1: failed to verify certificate: x509: certificate signed by unknown authority
config.hcl
ui = true
cluster_name = "vault-internal"
# HTTP listener with TLS enforced
listener "tcp" {
  address = "[::]:8200"
  cluster_address = "[::]:8201"
  tls_cert_file = "/vault/tls/vault.crt"
  tls_key_file = "/vault/tls/vault.key"
  tls_client_ca_file = "/vault/tls/vault.ca"
  tls_require_and_verify_client_cert = false
  tls_disable_client_certs = false
}
# Vault runs in HA mode
storage "raft" {
  path = "/vault/data"
  node_id = "HOSTNAME_PLACEHOLDER" # initContainer will replace this with hostname (vault-0 | vault-1 | vault-2)
  # Recommended: enable raft TLS
  retry_join {
    leader_api_addr = "https://vault-0.vault.securities.svc.cluster.local:8200"
    leader_ca_cert_file = "/vault/tls/vault.ca"
    leader_client_cert_file = "/vault/tls/vault.crt"
    leader_client_key_file = "/vault/tls/vault.key"
  }
}
service_registration "kubernetes" {
  namespace = "securities"
}
# Advertise addresses to other nodes
api_addr = "https://HOSTNAME_PLACEHOLDER.vault.securities.svc.cluster.local:8200"
cluster_addr = "https://HOSTNAME_PLACEHOLDER.vault.securities.svc.cluster.local:8201"
# Logging
log_level = "info"
# Disable memory locking (mlock)
disable_mlock = true
I generated the self-signed certs by following the HashiCorp docs, all the way through storing them in Kubernetes secrets. I have also verified that the 3 secrets are mounted in the /vault/tls/ folder in all 3 pods.
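Mounted files only show that the secrets exist; whether vault.crt was actually signed by vault.ca is a separate question. A minimal sketch of that check, assuming openssl is available in the image (it may not be in a minimal one) and using the paths from config.hcl; check_chain is a hypothetical helper name:

```shell
# Hypothetical helper: exit status 0 if ca_file validates cert_file.
check_chain() {
  ca_file="$1"
  cert_file="$2"
  openssl verify -CAfile "$ca_file" "$cert_file" >/dev/null 2>&1
}

# Paths taken from config.hcl; run inside each pod.
if check_chain /vault/tls/vault.ca /vault/tls/vault.crt; then
  echo "vault.crt chains to vault.ca"
else
  echo "vault.crt is NOT validated by vault.ca (or a file/openssl is missing)"
fi
```

Running this in each pod would show whether that pod's vault.crt chains to the vault.ca mounted in the same pod. It does not show whether the vault.ca on vault-1 is the CA that signed the certificate vault-0 serves.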
vault-0 initializes without any error. Output of vault status:
Key                     Value
---                     -----
Seal Type               shamir
Initialized             true
Sealed                  false
Total Shares            5
Threshold               3
Version                 1.19.4
Build Date              2025-05-01T07:03:55Z
Storage Type            raft
Cluster Name            vault-internal
Cluster ID              9f82ab44-7cf3-c37f-7810-03470739c8b2
Removed From Cluster    false
HA Enabled              true
HA Cluster              https://127.0.0.1:8201
HA Mode                 active
Active Since            2025-05-05T11:11:30.213413817Z
Raft Committed Index    39
Raft Applied Index      39
vault-1 is not able to join vault-0. Logs from vault-1:
2025-05-05T11:12:16.082Z [INFO] core: security barrier not initialized
2025-05-05T11:12:16.082Z [INFO] core: attempting to join possible raft leader node: leader_addr=https://vault-0.vault.securities.svc.cluster.local:8200
2025-05-05T11:12:16.088Z [ERROR] core: failed to retry join raft cluster: retry=2s err="waiting for unseal keys to be supplied"
2025-05-05T11:12:16.613Z [INFO] core: security barrier not initialized
2025-05-05T11:12:16.613Z [INFO] core: security barrier not initialized
2025-05-05T11:12:16.613Z [INFO] core: attempting to join possible raft leader node: leader_addr=https://vault-0.vault.securities.svc.cluster.local:8200
2025-05-05T11:12:16.620Z [ERROR] core: failed to get raft challenge: leader_addr=https://vault-0.vault.securities.svc.cluster.local:8200 error="error during raft bootstrap init call: Put \"https://vault-0.vault.securities.svc.cluster.local:8200/v1/sys/storage/raft/bootstrap/challenge\": tls: failed to verify certificate: x509: certificate signed by unknown authority"
2025-05-05T11:12:16.620Z [ERROR] core: failed to join raft cluster: error="failed to get raft challenge"
2025-05-05T11:12:18.088Z [INFO] core: attempting to join possible raft leader node: leader_addr=https://vault-0.vault.securities.svc.cluster.local:8200
2025-05-05T11:12:18.094Z [ERROR] core: failed to retry join raft cluster: retry=2s err="waiting for unseal keys to be supplied"
2025-05-05T11:12:20.094Z [INFO] core: security barrier not initialized
2025-05-05T11:12:20.094Z [INFO] core: attempting to join possible raft leader node: leader_addr=https://vault-0.vault.securities.svc.cluster.local:8200
2025-05-05T11:12:20.099Z [ERROR] core: failed to retry join raft cluster: retry=2s err="waiting for unseal keys to be supplied"
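The failing call in these logs is the TLS handshake from vault-1 to vault-0 on port 8200, so it may help to look at the certificate vault-0 actually serves. A rough sketch, assuming openssl exists in the vault-1 container; fetch_served_cert and issuer_matches_ca are hypothetical helper names, and the /tmp/served.crt path is arbitrary:

```shell
# Hypothetical helper: save the leaf certificate a server presents.
fetch_served_cert() {
  host="$1"
  port="$2"
  out_file="$3"
  openssl s_client -connect "$host:$port" -servername "$host" </dev/null 2>/dev/null \
    | openssl x509 -outform PEM >"$out_file" 2>/dev/null
}

# Hypothetical helper: exit status 0 if the certificate's issuer name equals
# the CA file's subject name. This is a name comparison only, not a
# signature check; `openssl verify -CAfile` does the cryptographic check.
issuer_matches_ca() {
  cert_file="$1"
  ca_file="$2"
  issuer=$(openssl x509 -noout -issuer -in "$cert_file" 2>/dev/null | sed 's/^issuer=//')
  subject=$(openssl x509 -noout -subject -in "$ca_file" 2>/dev/null | sed 's/^subject=//')
  [ -n "$issuer" ] && [ "$issuer" = "$subject" ]
}

# Only attempt the live check where the mounted CA and openssl exist (i.e. in the pod).
if [ -f /vault/tls/vault.ca ] && command -v openssl >/dev/null 2>&1; then
  fetch_served_cert vault-0.vault.securities.svc.cluster.local 8200 /tmp/served.crt
  if issuer_matches_ca /tmp/served.crt /vault/tls/vault.ca; then
    echo "served cert names vault.ca as its issuer"
  else
    echo "served cert issuer differs from vault.ca subject"
  fi
fi
```

If the issuer differs, the vault.ca mounted in vault-1 is probably not the CA behind the certificate vault-0 presents, which would be consistent with the "certificate signed by unknown authority" error.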
I have also checked with nslookup that the service name resolves from all 3 pods:
nslookup vault
Server: 10.96.0.10
Address: 10.96.0.10:53
** server can't find vault.cluster.local: NXDOMAIN
** server can't find vault.cluster.local: NXDOMAIN
** server can't find vault.svc.cluster.local: NXDOMAIN
Name: vault.securities.svc.cluster.local
Address: 10.1.1.150
Name: vault.securities.svc.cluster.local
Address: 10.1.1.149
Name: vault.securities.svc.cluster.local
Address: 10.1.1.151
The StatefulSet pod FQDN also resolves:
vault-0:/vault# nslookup vault-1.vault.securities.svc.cluster.local
Server: 10.96.0.10
Address: 10.96.0.10:53
Name: vault-1.vault.securities.svc.cluster.local
Address: 10.1.1.150
At this point I'm at a loss as to what my mistake is. Please, any tips on what I can do?