We are attempting to roll out Vault in our production environment, but in our dev phase we are having trouble getting a cluster up and running. We currently have a single node that runs fine, but we need redundancy, so we want a 3-node HA cluster using raft storage.
After the primary Vault node was stood up, I created two matching VMs and adjusted the config file on each to use the retry_join stanza. When I reboot them all and try to unseal the primary, I get the following error:
“Error looking up token: Error making API request. URL: GET {vault address}/v1/auth/token/lookup-self Code: 500. Errors: * local node not active but active cluster node not found”
Here is an example of the config, obvious parts redacted.
ui = true
disable_mlock = false
api_addr = "https://{primary vault dns name}:8200"
cluster_addr = "https://{primary vault dns name}:8201"

storage "raft" {
  path = "/vault/data"
  node_id = "primary_node"

  retry_join {
    leader_api_addr = "https://{secondary1 dns name}:8200"
    leader_ca_cert_file = "/etc/vault/CA.crt"
    leader_client_cert_file = "/vault/cert.crt"
    leader_client_key_file = "/vault/key.key"
  }

  retry_join {
    leader_api_addr = "https://{secondary2 dns name}:8200"
    leader_ca_cert_file = "/vault/CA.crt"
    leader_client_cert_file = "/vault/cert.crt"
    leader_client_key_file = "/vault/key.key"
  }
}

# HTTPS listener
listener "tcp" {
  address = "0.0.0.0:8200"
  tls_disable = "false"
  tls_cert_file = "/vault/cert.crt"
  tls_key_file = "/vault/key.key"

  telemetry {
    unauthenticated_metrics_access = "true"
  }
}

telemetry {
  prometheus_retention_time = "30s"
  disable_hostname = true
}
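For comparison, the storage stanza on each secondary would presumably mirror the one above, with retry_join pointing back at the primary. The following is only a sketch under assumptions: the node_id value "secondary1_node" is hypothetical, and the certificate paths are copied from the primary's config rather than taken from the secondaries' actual files, which are not shown here.

```hcl
# Hypothetical sketch of a secondary's addresses and storage stanza.
# The node_id and the file paths are assumptions, not the real config.
api_addr     = "https://{secondary1 dns name}:8200"
cluster_addr = "https://{secondary1 dns name}:8201"

storage "raft" {
  path    = "/vault/data"
  node_id = "secondary1_node" # assumed name; node IDs must be unique per node

  retry_join {
    leader_api_addr         = "https://{primary vault dns name}:8200"
    leader_ca_cert_file     = "/vault/CA.crt"
    leader_client_cert_file = "/vault/cert.crt"
    leader_client_key_file  = "/vault/key.key"
  }
}
```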
I have also tried running the vault operator raft join https://{primary dns name}:8200 command, which returns joined true but does not actually join the cluster.
Further, I even tried joining the secondaries from their respective GUIs. This appeared to work for one of the two secondary nodes, since it showed up on the primary when running vault operator raft list-peers, but the step-down command yielded no results.
I’m at a loss for what to try next, so any and all responses would be greatly appreciated.