Hi,
We are running three Windows servers in a cluster with Consul 1.13.1. In this environment, Consul unexpectedly stopped working. The only way we managed to get it back online was by restoring files from a previous backup. We suspect that the data folder may have been corrupted and that this issue propagated to the other Consul nodes in the cluster.
From what we can tell, there were no issues with power, memory, or CPU at the time of the failure.
When a client tries to connect, it receives the following error:
Orbit.Consul.Exceptions.ConsulCommunicationException: Couldn't reach consul
---> Consul.ConsulRequestException: Unexpected response, status code InternalServerError:
rpc error getting client: failed to get conn: x509: certificate has expired or is not yet valid:
The error suggests that it could be certificate-related, but all certificates are up to date. After restoring the Consul files from the backup, everything worked again.
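For reference, an x509 validity error of this kind is raised when the validating machine's current time falls outside the certificate's notBefore/notAfter window, so the system clock matters as much as the dates in the certificate. Below is a minimal sketch of that comparison, assuming Python's standard ssl module. The function name check_validity and the date strings are hypothetical illustrations, not values from our certificates, and this is not Consul's own validation code.

```python
import ssl
import time


def check_validity(not_before, not_after, now=None):
    """Classify a validity window against a timestamp (hypothetical helper).

    not_before / not_after use the textual form found in certificates,
    e.g. "Jan  1 00:00:00 2023 GMT". now is seconds since the epoch and
    defaults to the local system clock.
    """
    if now is None:
        now = time.time()
    start = ssl.cert_time_to_seconds(not_before)
    end = ssl.cert_time_to_seconds(not_after)
    if now < start:
        return "not yet valid"
    if now > end:
        return "expired"
    return "valid"


# Illustrative dates only (assumed, not from the cluster):
print(check_validity("Jan  1 00:00:00 2023 GMT", "Jan  1 00:00:00 2024 GMT"))
```

Running the same check with the real notBefore/notAfter values on each node would show whether any node's clock places the certificates outside their validity window.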
We’ve enabled logging at the info level to gather more data in case this happens again. Has anyone experienced a similar issue? Any advice or solutions would be greatly appreciated.
Best regards,