Hello!
Before I open an issue on GitHub, I want to make sure that there is no mistake in my configuration.
I want to use Vault inside k8s. I use my own CA, and the client certificates are created by cert-manager. This works well.
tls.crt: signed client certificate for vault-0/1/2
tls.key: private key for the client certificate of vault-0/1/2
ca.crt: public cert of my own CA
Description of the setup:
A StatefulSet with 3 replicas; tls.crt, tls.key and ca.crt are injected from a Secret.
In the end, the Vault server runs with the following config:
disable_mlock = true
ui = true
listener "tcp" {
tls_disable = 0
address = "[::]:8200"
tls_cert_file = "/vault/ssl/tls.crt"
tls_key_file = "/vault/ssl/tls.key"
tls_client_ca_file = "/vault/ssl/ca.crt"
tls_min_version = "tls12"
tls_disable_client_certs = true
cluster_address = "[::]:8201"
}
storage "raft" {
path = "/vault/data"
}
service_registration "kubernetes" {}
seal "awskms" {
kms_key_id = "XYZ"
region = "eu-west-1"
}
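For completeness: the raft stanza could also carry retry_join blocks that point at my CA file, so that the nodes try to join on their own. This is only a sketch that I have not deployed. render_raft_stanza is a hypothetical helper name, and leader_api_addr / leader_ca_cert_file are the retry_join parameters as I understand them from the Vault documentation.

```shell
#!/bin/sh
# Hypothetical helper: prints a raft storage stanza with one retry_join
# block per peer. $1 is the CA file, the remaining arguments are pod names.
render_raft_stanza() {
  ca_file="$1"; shift
  printf 'storage "raft" {\n'
  printf '  path = "/vault/data"\n'
  for peer in "$@"; do
    printf '  retry_join {\n'
    # Assumption: peers are reachable as <pod>.vault-internal on port 8200.
    printf '    leader_api_addr     = "https://%s.vault-internal:8200"\n' "$peer"
    printf '    leader_ca_cert_file = "%s"\n' "$ca_file"
    printf '  }\n'
  done
  printf '}\n'
}

render_raft_stanza /vault/ssl/ca.crt vault-0 vault-1 vault-2
```

The output would replace the plain storage "raft" stanza in the config above.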
The k8s service registration and all the other k8s parts work well too.
So now I have 3 pods (vault-0, vault-1, vault-2) and I want to build an HA cluster with TLS.
- Initialize Vault on vault-0
kubectl exec -i -t -n vault vault-0 -c vault -- sh -c "clear; sh"
/ $ vault operator init
# a lot of output
Success! Vault is initialized
/ $ vault status
Key Value
--- -----
Recovery Seal Type shamir
Initialized true
Sealed false
Total Recovery Shares 5
Threshold 3
Version 1.9.2
Storage Type raft
Cluster Name vault-cluster-a3177319
Cluster ID 0a8b0c83-6979-e06d-ac0a-e9a227356eeb
HA Enabled true
HA Cluster https://vault-0.vault-internal:8201
HA Mode active
Active Since 2022-04-08T12:29:38.03703838Z
Raft Committed Index 31
Raft Applied Index 31
Now let vault-1 join the cluster:
kubectl exec -i -t -n vault vault-1 -c vault -- sh -c "clear; sh"
/ $ vault operator raft join https://vault-0.vault-internal:8200
Error joining the node to the Raft cluster: Error making API request.
URL: POST https://vault-1.vault-internal:8200/v1/sys/storage/raft/join
Code: 500. Errors:
* failed to join raft cluster: failed to join any raft leader node
Logfiles:
vault-1:vault 2022-04-08T12:33:51.904Z [INFO] core: attempting to join possible raft leader node: leader_addr=https://vault-0.vault-internal:8200
vault-0:vault 2022-04-08T12:33:51.915Z [INFO] http: TLS handshake error from 10.207.11.240:51218: remote error: tls: bad certificate
vault-1:vault 2022-04-08T12:33:51.915Z [WARN] core: join attempt failed: error="error during raft bootstrap init call: Put \"https://vault-0.vault-internal:8200/v1/sys/storage/raft/bootstrap/challenge\": x509: certificate signed by unknown authority"
vault-1:vault 2022-04-08T12:33:51.915Z [ERROR] core: failed to join raft cluster: error="failed to join any raft leader node"
I get the same result if I use the following command on vault-1:
vault operator raft join -leader-ca-cert="/vault/ssl/ca.crt" https://vault-0.vault-internal:8200
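One variant I still need to rule out: as far as I understand, -leader-ca-cert may expect the PEM contents of the CA rather than a file path. That is an assumption on my side, not something I have confirmed in this setup. A minimal sketch, with join_with_ca as a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical wrapper: passes the CA certificate contents (not the path)
# to `vault operator raft join`. $1 is the CA file, $2 the leader API address.
join_with_ca() {
  ca_file="$1"; leader="$2"
  if [ ! -r "$ca_file" ]; then
    echo "cannot read CA file: $ca_file" >&2
    return 1
  fi
  # Assumption: the flag takes the PEM text itself, so the file is read here.
  vault operator raft join -leader-ca-cert="$(cat "$ca_file")" "$leader"
}

# Usage on vault-1:
#   join_with_ca /vault/ssl/ca.crt https://vault-0.vault-internal:8200
```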
Workaround: the following patch works, but it is not usable in production.
Unix systems use the CA files under /etc/ssl/certs to verify TLS certificates. I changed /etc/ssl/certs/ca-certificates.crt in the Vault server pod so that it contains the original content plus my own CA. (It is really a bad patch: /etc/ssl/certs is read-only and it is not possible to add any file to this folder in the pod, so I had to “overmount” it.)
volumeMounts:
- name: certificates-etc
mountPath: /etc/ssl/certs
readOnly: true
With this change, the Vault server on vault-0 was able to verify the certificate from vault-1, and vault-1 was able to connect to vault-0 via HTTPS and join the cluster. In this patched setup I performed the same steps as above.
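A possibly less invasive form of the same workaround, which I have not tested: build the combined bundle in a writable location and point the process at it. This assumes that Vault, being a Go binary, honors the SSL_CERT_FILE environment variable. build_ca_bundle and the /tmp path are hypothetical names chosen for illustration.

```shell
#!/bin/sh
# Hypothetical helper: writes the system bundle followed by an extra CA
# into a new file. $1 system bundle, $2 extra CA, $3 output file.
build_ca_bundle() {
  system_bundle="$1"; extra_ca="$2"; out="$3"
  if [ ! -r "$system_bundle" ] || [ ! -r "$extra_ca" ]; then
    echo "cannot read input bundle or CA file" >&2
    return 1
  fi
  # The echo guards against a missing trailing newline in the system bundle.
  { cat "$system_bundle"; echo; cat "$extra_ca"; } > "$out"
}

# Usage in the pod before the Vault server starts:
#   build_ca_bundle /etc/ssl/certs/ca-certificates.crt /vault/ssl/ca.crt /tmp/ca-bundle.crt
#   export SSL_CERT_FILE=/tmp/ca-bundle.crt
```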
What I am missing: how can I configure the Vault server so that it uses my own ca.crt file to verify client certificates, instead of checking only the CAs from /etc/ssl/certs/ca-certificates.crt?
best regards,
thomas.