Vault won't auto-unseal after initialisation until I restart it once

Hello,

I’m provisioning a Vault cluster that auto-unseals against a separate transit node. This is how all the cluster nodes are configured:

# Full configuration options can be found at https://www.vaultproject.io/docs/configuration
ui = true
storage "raft" {
        path = "/opt/vault/data"
        retry_join {
                # 'leader_api_addr' means 'address of a possible leader node'
                leader_api_addr = "https://vault-0.node.rsato.internal:8200"
                leader_ca_cert_file = "/opt/vault/tls/ca.crt"
                leader_client_cert_file = "/opt/vault/tls/tls.crt"
                leader_client_key_file = "/opt/vault/tls/tls.key"
        }
        retry_join {
                leader_api_addr = "https://vault-1.node.rsato.internal:8200"
                leader_ca_cert_file = "/opt/vault/tls/ca.crt"
                leader_client_cert_file = "/opt/vault/tls/tls.crt"
                leader_client_key_file = "/opt/vault/tls/tls.key"
        }
        retry_join {
                leader_api_addr = "https://vault-2.node.rsato.internal:8200"
                leader_ca_cert_file = "/opt/vault/tls/ca.crt"
                leader_client_cert_file = "/opt/vault/tls/tls.crt"
                leader_client_key_file = "/opt/vault/tls/tls.key"
        }
}

# HTTPS listener
# The listener stanza may be specified more than once to make Vault listen on multiple interfaces. If you configure multiple listeners you also need to specify api_addr and cluster_addr so Vault will advertise the correct address to other nodes.
listener "tcp" {
  address       = "0.0.0.0:8200"
  tls_cert_file = "/opt/vault/tls/tls.crt"
  tls_key_file  = "/opt/vault/tls/tls.key"
  tls_require_and_verify_client_cert = "true"
  tls_client_ca_file = "/opt/vault/tls/ca.crt"
  tls_min_version = "tls12"
}

# HA parameters
api_addr = "https://vault-2.node.rsato.internal:8200"
cluster_addr = "https://vault-2.node.rsato.internal:8201"

seal "transit" {
  address = "https://vault-transit.node.rsato.internal:8200"
  disable_renewal = "false"
  key_name = "autounseal"
  mount_path = "transit/"
  tls_ca_cert = "/opt/vault/tls/ca.crt"
  tls_client_cert = "/opt/vault/tls/tls.crt"
  tls_client_key = "/opt/vault/tls/tls.key"
  tls_server_name = "vault-transit.node.rsato.internal"
  tls_skip_verify = "false"
  token = "hvs.CAESIAtpi9RDCRxtsw9pZV7Lnlifyje63dl6gsQQFXFA_o2rGh4KHGh2cy5janJvTFJnWmJwc1dzazYyUWVQQkdsU0w"
}

(each with their own hostname)
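
To make the per-node difference concrete, here is a minimal sketch of how the two HA parameters could be rendered for each hostname. The function name `ha_params` and the idea of templating the config from Python are assumptions for illustration only; the domain and ports are taken from the config above.

```python
def ha_params(hostname: str, domain: str = "node.rsato.internal") -> str:
    """Render the node-specific HA lines of the Vault config.

    Hypothetical helper: only api_addr and cluster_addr differ per node,
    everything else in the config above is identical across the cluster.
    """
    fqdn = f"{hostname}.{domain}"
    return (
        f'api_addr = "https://{fqdn}:8200"\n'
        f'cluster_addr = "https://{fqdn}:8201"\n'
    )


if __name__ == "__main__":
    # Print the HA block for each of the three cluster nodes.
    for node in ("vault-0", "vault-1", "vault-2"):
        print(ha_params(node))
```

The same substitution can of course be done with whatever templating the provisioning tool already offers; the point is only that the rest of the file stays the same on every node.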
So the nodes are vault-0, vault-1 and vault-2. I initialise the cluster on vault-0, but vault-0 isn’t necessarily the first VM that Terraform provisions. The VMs are deployed in random order, so vault-1 or vault-2 may already be running before vault-0 is initialised.
When that happens, the nodes that started early keep logging this error message over and over:

Sep 25 22:21:11 vault-2 vault[1005]: 2022-09-25T22:21:11.218+0300 [INFO]  core: stored unseal keys supported, attempting fetch
Sep 25 22:21:11 vault-2 vault[1005]: 2022-09-25T22:21:11.218+0300 [WARN]  failed to unseal core: error="stored unseal keys are supported, but none were found"

After initialising vault-0, I expect the other nodes to join and unseal on their own after a while. Instead, Vault stays stuck in this state, and the only way to work around the problem is to restart the service. After I do, it works instantly:

Sep 25 22:37:19 vault-2 vault[3233]: 2022-09-25T22:37:19.826+0300 [INFO]  core: vault is unsealed
Sep 25 22:37:19 vault-2 vault[3233]: 2022-09-25T22:37:19.826+0300 [INFO]  core: unsealed with stored key
Sep 25 22:37:19 vault-2 vault[3233]: 2022-09-25T22:37:19.826+0300 [INFO]  core: entering standby mode

This is rather frustrating, as it prevents me from fully automating the Vault cluster with Terraform. I’m looking forward to any advice other than restarting the service :smiley:

This sounds like a known bug in Vault 1.11.1 and 1.11.2. Upgrade to 1.11.3.

Thank you! Your answer is spot-on. Upgrading to 1.11.3 solved the issue immediately.
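
For anyone automating this, a pre-flight check along the following lines could flag the affected releases before the nodes are deployed. This is only a sketch: the affected versions are the two named in the reply above, the helper names are hypothetical, and treating build-metadata suffixes such as `+ent` the same as the base version is an assumption.

```python
# Versions named in the reply above as affected by the auto-unseal bug.
AFFECTED_VERSIONS = {"1.11.1", "1.11.2"}


def normalize(version: str) -> str:
    """Strip surrounding whitespace, a leading 'v', and any '+build' suffix."""
    core = version.strip()
    if core.startswith("v"):
        core = core[1:]
    # Assumption: build metadata (e.g. '+ent') does not change whether
    # the release is affected.
    return core.split("+", 1)[0]


def is_affected(version: str) -> bool:
    """Return True if the given Vault version is one of the affected releases."""
    return normalize(version) in AFFECTED_VERSIONS


if __name__ == "__main__":
    for candidate in ("1.11.1", "v1.11.2", "1.11.3"):
        status = "affected, upgrade first" if is_affected(candidate) else "ok"
        print(f"{candidate}: {status}")
```

The version string would come from wherever the deployment pins the Vault package; the check deliberately does no range comparison, since only those two releases were reported as affected in this thread.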