Root certificate rotation ends up in "x509: certificate has expired or is not yet valid"

Hi,

I am using Consul 1.9.3 with Vault as the CA provider. My mesh is composed of 4 datacenters: the primary runs on VMs and the other 3 run on k8s, either EKS or bare metal.

The scenario is the following:

I set up my datacenters and they federate well; I can run any service.
After a random time (4 to 6 days) of leaving the mesh untouched, I get the "x509: certificate has expired or is not yet valid" error.
Looking at the /v1/connect/ca/roots endpoint, I can see that the root certificate is indeed expired, but I cannot tell why.

I read in the code that there is some kind of watcher on the certificate validity, and that it should start the rotation process on its own after around 60% of the TTL has elapsed.

The following can be found in agents/cache-types/connect_leaf_ca.go:

//          0                              60%             90%
//   Issued [------------------------------|===============|!!!!!] Expires
// 72h TTL: 0                             ~43h            ~65h
//  1h TTL: 0                              36m             54m

I have very few logs showing what is happening, even though I run at DEBUG log level everywhere.

This log line appears at random on the primary datacenter, without any preceding error:

Mar  2 16:02:54 ip-10-0-4-156 consul[1297]:     2021-03-02T16:02:54.154Z [ERROR] agent.server.rpc: failed to read byte: conn=from=10.0.3.81:51263 error="tls: failed to verify client certificate: x509: certificate has expired or is not yet valid: current time 2021-03-02T16:02:54Z is after 2021-02-11T11:52:08Z"

I’ve been struggling with this for a while now… Would be happy to provide more detail.

There is a related issue including more detail :

Thanks for your help.

Marius