Vault does not respond after unsealing

I have two HA Vault servers backed by five Consul storage servers. Recently I ran into a strange issue: Vault does not respond when I try to log in, and the request fails with a timeout error.

I restarted the Vault service, unsealed it, and got the following log output:

vault[4610]: 2022-02-06T10:25:59.485+0300 [INFO] core: successfully enabled credential backend: type=ldap path=ldap/
Feb 06 10:25:59 vault-1 vault[4610]: 2022-02-06T10:25:59.485+0300 [INFO] core: successfully enabled credential backend: type=approle path=approle/
Feb 06 10:25:59 vault-1 vault[4610]: 2022-02-06T10:25:59.485+0300 [INFO] core: successfully enabled credential backend: type=cert path=cert/
Feb 06 10:25:59 vault-1 vault[4610]: 2022-02-06T10:25:59.489+0300 [INFO] core: restoring leases
Feb 06 10:25:59 vault-1 vault[4610]: 2022-02-06T10:25:59.489+0300 [INFO] rollback: starting rollback manager
Feb 06 10:25:59 vault-1 vault[4610]: 2022-02-06T10:25:59.505+0300 [INFO] expiration: lease restore complete
Feb 06 10:25:59 vault-1 vault[4610]: 2022-02-06T10:25:59.506+0300 [INFO] identity: entities restored
Feb 06 10:25:59 vault-1 vault[4610]: 2022-02-06T10:25:59.508+0300 [INFO] identity: groups restored
Feb 06 10:25:59 vault-1 vault[4610]: 2022-02-06T10:25:59.511+0300 [INFO] core: usage gauge collection is disabled
Feb 06 10:25:59 vault-1 vault[4610]: 2022-02-06T10:25:59.516+0300 [INFO] core: post-unseal setup complete

Any help please!

Not enough info here. Please include your config, the startup commands you're using, and the errors you're seeing. You may also want to change the config to enable debug logging and include that output as well.


Actually, this issue appeared two months after I upgraded Vault to v1.8.5.
Sometimes the system works normally, then it suddenly gets stuck because of this issue, stays that way for a while, and then goes back to working fine.

When the system enters the stuck state, the log stream shows entries like the following:
Feb 06 11:07:03 vault-1 vault[4610]: 2022-02-06T11:07:03.381+0300 [INFO] expiration: revoked lease: lease_id=auth/approle/login/h94ed8bbb811fe1526d07556f7cd53f4f0086053ae90b9305dcd17c984a4e0f7c
Feb 06 12:44:02 vault-1 vault[4610]: 2022-02-06T12:44:02.182+0300 [INFO] expiration: revoked lease: lease_id=auth/ldap/login/ahmed/h401a12097cac066c76e29a81f351207c0375ddbd39db1aa018c6ff12841340b0
Feb 06 12:45:06 vault-1 vault[4610]: 2022-02-06T12:45:06.579+0300 [INFO] expiration: revoked lease: lease_id=auth/ldap/login/ali/h8ac214a770d79ee9fcd985c2b275e88d9442095f43537d3077d0fd0e0e9d53d9
Feb 06 12:54:48 vault-1 vault[4610]: 2022-02-06T12:54:48.266+0300 [INFO] expiration: revoked lease: lease_id=auth/ldap/login/ali/hec20c94acbeb5da59b7b6862b7363f5ab8ce3657021308eb41d5c9d470fb8724

The config file is:
backend "consul" {
  address = "127.0.0.1:8500"
}

listener "tcp" {
  address = "hostname:8200"
  tls_disable = 0
  tls_cert_file = "/etc/vault/ssl/vault-crt.crt"
  tls_key_file = "/etc/vault/ssl/vault-key.key"
}

api_addr = "https://loadbalancer-ip:8200"
cluster_addr = "https://loadbalancer-ip:8201"

On every node, api_addr needs to be that host's own IP:8200, not your VIP; that is the address the nodes use to reach each other. You're also missing ui = true if you're using the UI. I'd suggest adding log_level = "debug" at the bottom.
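As a rough sketch of those suggestions applied to the posted config, the file on one node might look like the fragment below. The address 10.0.0.11 is a hypothetical placeholder for that node's own IP, and pointing cluster_addr at the same node is an assumption that mirrors the api_addr advice; each node would carry its own address.

```hcl
backend "consul" {
  address = "127.0.0.1:8500"
}

listener "tcp" {
  address       = "hostname:8200"
  tls_disable   = 0
  tls_cert_file = "/etc/vault/ssl/vault-crt.crt"
  tls_key_file  = "/etc/vault/ssl/vault-key.key"
}

# Hypothetical node IP: replace 10.0.0.11 with this host's own address,
# not the load balancer VIP.
api_addr     = "https://10.0.0.11:8200"
# Assumption: cluster_addr is also set per node, on the cluster port.
cluster_addr = "https://10.0.0.11:8201"

# Only needed if the web UI is used.
ui = true

# Temporary, while trying to catch the hang in the logs.
log_level = "debug"
```

Vault needs a restart and an unseal before config changes like these take effect.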

It's possible that your Consul servers are overloaded or losing Raft quorum. I would turn on debug logging there as well to see if you can catch the issue while it's happening.
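On the Consul side, debug logging could be enabled with a fragment like this in each server's agent configuration. This is a sketch that assumes the agents are configured with HCL files; the same key applies in JSON form.

```hcl
# Consul agent configuration fragment: raise log verbosity while investigating.
log_level = "debug"
```

Setting it back to the previous level once the hang has been captured keeps log volume manageable.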


Should that IP address be the host's own IP, or the IP address of the other HA member?

Thank you for your support and cooperation