Error unsealing: context deadline exceeded

Hello,

We are trying to set up a Vault cluster of 3 VMs with HA and Raft storage, but we get this error on the second and third Vault VMs:
“Error unsealing: context deadline exceeded”

We are able to:
init in the first node;
unseal successfully with 3 keys,

But when we try to unseal the second/third node with this command:
sudo vault operator unseal

The first 2 keys are accepted, but when we enter the third key,
it hangs for a while and then prints this in red: Error unsealing: context deadline exceeded

What can be the problem?

Thanks
Daniel

Hello,

We have this as settings:

```
ui = true

disable_cache           = true
disable_mlock           = true
#mlock = true

storage "raft" {
  path    = "/opt/vault/data"
  node_id = "D-VAULT1"
  retry_join {
    leader_api_addr = "https://vault1.rc.local:8200"
  }
  retry_join {
    leader_api_addr = "https://vault2.rc.local:8200"
  }
  retry_join {
    leader_api_addr = "https://vault3.rc.local:8200"
  }
}

# HTTPS listener
listener "tcp" {
  address            = "0.0.0.0:8200"
  cluster_addr       = "0.0.0.0:8201"
  tls_disable        = "false"
  tls_client_ca_file = "/opt/vault/tls/ca.pem"
  tls_cert_file      = "/opt/vault/tls/tls.crt"
  tls_key_file       = "/opt/vault/tls/tls.key"
}

cluster_addr            = "https://vault1.rc.local:8201"
api_addr                = "https://vault.rc.local:8200"
disable_mlock           = true
max_lease_ttl           = "240h"
default_lease_ttl       = "240h"
cluster_name            = "vault"
raw_storage_endpoint    = true
disable_sealwrap        = true
disable_printable_check = true
```

vault.rc.local points to a VIP managed by keepalived across these 3 Vault servers.

Thanks
Daniel

Please see the “Welcome to the forum” message, then reformat your message and fix the formatting of your configuration as detailed there.

“context deadline exceeded” is just a complex way of saying that the Vault server failed to respond in a reasonable amount of time. It gives no clue as to why that is the case.

To find out more, you would have to look at the log messages printed by the Vault server to stdout/stderr.
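
As a rough sketch of how to narrow that output down: the helper below keeps only the `[ERROR]` lines and the indented `|` continuation lines that carry the error details. The function name `vault_errors` is made up for illustration, and the usage comment assumes Vault runs as a systemd unit named `vault`, which may not match your setup.

```shell
# Hypothetical helper: keep Vault log lines at ERROR level, plus the
# indented "|" continuation lines that hold the details of each error.
vault_errors() {
  grep -E '\[ERROR\]|vault\[[0-9]+\]: +\|'
}

# Usage on a systemd host (the unit name "vault" is an assumption):
#   journalctl -u vault --no-pager -n 200 | vault_errors
```

Reading from a pipe keeps the filter independent of where the logs come from, so the same function works on a saved log file.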

Hello Maxb

I am not sure if this is the correct way to debug or view the log messages:

daniel@D-VAULT2:/root$ journalctl -f -u vault
May 19 23:36:14 D-VAULT2 vault[770]:
May 19 23:36:14 D-VAULT2 vault[770]: 2023-05-19T23:36:14.880+0800 [ERROR] core: failed to retry join raft cluster: retry=2s
May 19 23:36:14 D-VAULT2 vault[770]:   err=
May 19 23:36:14 D-VAULT2 vault[770]:   | failed to send answer to raft leader node: Error making API request.
May 19 23:36:14 D-VAULT2 vault[770]:   |
May 19 23:36:14 D-VAULT2 vault[770]:   | URL: PUT https://vault1.rc.local:8200/v1/sys/storage/raft/bootstrap/answer
May 19 23:36:14 D-VAULT2 vault[770]:   | Code: 500. Errors:
May 19 23:36:14 D-VAULT2 vault[770]:   |
May 19 23:36:14 D-VAULT2 vault[770]:   | * Preventing server addition that would require removal of too many servers and cause cluster instability
May 19 23:36:14 D-VAULT2 vault[770]:
May 19 23:36:16 D-VAULT2 vault[770]: 2023-05-19T23:36:16.575+0800 [ERROR] core: failed to get raft challenge: leader_addr=https://vault2.rc.local:8200 error="error during raft bootstrap init call: context deadline exceeded"
May 19 23:36:16 D-VAULT2 vault[770]: 2023-05-19T23:36:16.881+0800 [INFO]  core: security barrier not initialized
May 19 23:36:16 D-VAULT2 vault[770]: 2023-05-19T23:36:16.882+0800 [INFO]  core: attempting to join possible raft leader node: leader_addr=https://vault1.rc.local:8200
May 19 23:36:16 D-VAULT2 vault[770]: 2023-05-19T23:36:16.882+0800 [INFO]  core: attempting to join possible raft leader node: leader_addr=https://vault2.rc.local:8200
May 19 23:36:16 D-VAULT2 vault[770]: 2023-05-19T23:36:16.882+0800 [INFO]  core: attempting to join possible raft leader node: leader_addr=https://vault3.rc.local:8200
May 19 23:36:16 D-VAULT2 vault[770]: 2023-05-19T23:36:16.889+0800 [ERROR] core: failed to get raft challenge: leader_addr=https://vault3.rc.local:8200
May 19 23:36:16 D-VAULT2 vault[770]:   error=
May 19 23:36:16 D-VAULT2 vault[770]:   | error during raft bootstrap init call: Error making API request.
May 19 23:36:16 D-VAULT2 vault[770]:   |
May 19 23:36:16 D-VAULT2 vault[770]:   | URL: PUT https://vault3.rc.local:8200/v1/sys/storage/raft/bootstrap/challenge
May 19 23:36:16 D-VAULT2 vault[770]:   | Code: 503. Errors:
May 19 23:36:16 D-VAULT2 vault[770]:   |
May 19 23:36:16 D-VAULT2 vault[770]:   | * Vault is sealed
May 19 23:36:16 D-VAULT2 vault[770]:
May 19 23:36:16 D-VAULT2 vault[770]: 2023-05-19T23:36:16.891+0800 [ERROR] core: failed to retry join raft cluster: retry=2s
May 19 23:36:16 D-VAULT2 vault[770]:   err=
May 19 23:36:16 D-VAULT2 vault[770]:   | failed to send answer to raft leader node: Error making API request.
May 19 23:36:16 D-VAULT2 vault[770]:   |
May 19 23:36:16 D-VAULT2 vault[770]:   | URL: PUT https://vault1.rc.local:8200/v1/sys/storage/raft/bootstrap/answer
May 19 23:36:16 D-VAULT2 vault[770]:   | Code: 500. Errors:
May 19 23:36:16 D-VAULT2 vault[770]:   |
May 19 23:36:16 D-VAULT2 vault[770]:   | * Preventing server addition that would require removal of too many servers and cause cluster instability

Thanks
Daniel

That sounds like it gets us a lot closer to the cause…

Is it possible that you have a per-node setting, such as cluster_addr, set to the same value on multiple nodes? I think that might break things in this way.

Hello maxb,

Thanks for the hint. Yes, I had the same cluster_addr on all 3 nodes by mistake. After I corrected it, everything works fine now. Thanks again!

Daniel
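
For anyone who lands here with the same error, a quick way to catch this mistake is to compare the per-node settings across the config files before starting the nodes. The sketch below is only an illustration: the function name `dup_node_settings` and the file names in the usage comment are hypothetical, and it assumes the layout shown earlier in this thread (an indented `node_id` inside the `storage "raft"` block and an unindented top-level `cluster_addr`).

```shell
# Hypothetical helper: print any node_id or top-level cluster_addr value
# that appears in more than one of the given Vault config files.
dup_node_settings() {
  # node_id sits inside the storage "raft" block (indented); the per-node
  # cluster_addr is the top-level one (not indented). The indented
  # cluster_addr inside the listener block is deliberately skipped.
  grep -hE '^[[:space:]]+node_id[[:space:]]*=|^cluster_addr[[:space:]]*=' "$@" |
    sed -E 's/^[[:space:]]+//; s/[[:space:]]*=[[:space:]]*/ = /' |
    sort | uniq -d
}

# Usage (file names are placeholders for copies of each node's config):
#   dup_node_settings vault1.hcl vault2.hcl vault3.hcl
```

No output means no duplicates were found. Any line it prints is a value shared by at least two nodes, which is what happened with cluster_addr here.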