Hi all,
We are seeing an issue where our Vault master pod is unable to come up.
We checked the pod logs and found the following error:
core.raft: failed to activate TLS key: error="failed to read raft TLS keyring: context canceled"
We would expect either the master to rejoin the Raft cluster or a new leader election to be triggered, but neither happens.
Please help with the following questions:
- How can we avoid this issue in the future?
- Which metric can we alert on to be notified when the master pod is unable to join the cluster?
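To make the second question concrete, here is a rough sketch of the kind of external check we have in mind until we know which metric to use. It polls Vault's `/v1/sys/health` endpoint on each pod and maps the HTTP status code to a node state (200 active, 429 standby, 501 not initialized, 503 sealed, with the endpoint's default settings). The function names `classify_health`, `should_alert`, `probe` and `check_cluster` are our own, and the alert rule (exactly what counts as unhealthy) is an assumption, not something Vault prescribes.

```python
import urllib.error
import urllib.request

# Status codes returned by /v1/sys/health with its default query parameters.
HEALTH_STATES = {
    200: "active",
    429: "standby",
    472: "dr-secondary",
    473: "performance-standby",
    501: "uninitialized",
    503: "sealed",
}


def classify_health(status_code):
    """Map a /v1/sys/health HTTP status code to a node state name."""
    return HEALTH_STATES.get(status_code, "unknown")


def should_alert(states):
    """Assumed rule: alert if no node is active, or if any node is
    in a state other than active or standby."""
    has_active = "active" in states.values()
    all_healthy = all(s in ("active", "standby") for s in states.values())
    return not (has_active and all_healthy)


def probe(addr, timeout=5.0):
    """Query one node and return its state name."""
    url = addr.rstrip("/") + "/v1/sys/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return classify_health(resp.status)
    except urllib.error.HTTPError as err:
        # Non-2xx codes such as 429 or 503 arrive here as HTTPError.
        return classify_health(err.code)
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, or timeout.
        return "unreachable"


def check_cluster(addrs):
    """Probe every address; return (per-node states, alert flag)."""
    states = {addr: probe(addr) for addr in addrs}
    return states, should_alert(states)
```

We would call `check_cluster` with the pod addresses from our `retry_join` entries, for example `check_cluster(["http://vault-prod-0.vault-prod-internal:8200", "http://vault-prod-1.vault-prod-internal:8200"])`, extended with the remaining pods. A pod that never comes up would show as `unreachable` and trigger the alert, but we would prefer a built-in metric if one exists.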
Setup information:
We are running the free (open-source) version of Vault in HA mode with 3 nodes: 1 master and 2 standbys.
Vault version: 1.12.0
Resource limits:
resources:
  requests:
    memory: 12Gi
    cpu: 4000m
  limits:
    memory: 15Gi
    cpu: 8000m
HA config:
ha:
  enabled: true
  replicas: 3
  raft:
    enabled: true
    setNodeId: true
    config: |
      ui = true
      disable_cache = true
      listener "tcp" {
        tls_disable = 1
        address = "[::]:8200"
        cluster_address = "[::]:8201"
        telemetry {
          unauthenticated_metrics_access = "true"
        }
      }
      storage "raft" {
        path = "/vault/data"
        retry_join {
          leader_api_addr = "http://vault-prod-0.vault-prod-internal:8200"
        }
        retry_join {
          leader_api_addr = "http://vault-prod-1.vault-prod-internal:8200"
        }