We’ve got a brand new bare metal Kubernetes 1.19.4 cluster that uses Cilium 1.9.0 as the CNI and MetalLB for load balancing. I’m attempting to mirror the services we’re running on another cluster to this one, starting with Consul. However, I’m running into a strange issue: the readiness check for the consul-server pods fails when it uses https://127.0.0.1:8501 as the endpoint, but succeeds with any of the other SANs, e.g. https://consul-server.consul:8501.
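For context, this is how I’ve been inspecting the probe definition on the failing pods (the pod name and namespace below are from our deployment, so adjust as needed):

kubectl describe pod consul-server-0 --namespace consul | grep -A 3 'Readiness'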
I’ve verified that tls.crt in consul-server-cert contains the appropriate SANs:
DNS:consul-server, DNS:*.consul-server, DNS:*.consul-server.consul, DNS:*.consul-server.consul.svc, DNS:*.server.dc1.consul, DNS:consul-server.consul, DNS:server.dc1.consul, DNS:localhost, IP Address:127.0.0.1
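(For reference, this is roughly how I extracted that list; the secret name and namespace are specific to our deployment.)

kubectl get secret consul-server-cert --namespace consul -o jsonpath='{.data.tls\.crt}' | base64 -d | openssl x509 -noout -text | grep -A 1 'Subject Alternative Name'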
But running the health check command ad hoc shows that curl cannot verify the server’s certificate when connecting via the loopback IP:
/consul/tls/ca # curl --cacert /consul/tls/ca/tls.crt https://127.0.0.1:8501/v1/status/leader
curl: (60) SSL certificate problem: unable to get local issuer certificate
More details here: https://curl.haxx.se/docs/sslcerts.html
curl failed to verify the legitimacy of the server and therefore could not
establish a secure connection to it. To learn more about this situation and
how to fix it, please visit the web page mentioned above.
However, using e.g. consul-server.consul works:
/consul/tls/ca # curl --cacert /consul/tls/ca/tls.crt https://consul-server.consul:8501/v1/status/leader
"10.0.2.183:8300"
I’ve completely wiped out the deployment, along with all associated secrets and persistent data, and started from scratch, with the same results. The same setup works as expected on our existing cluster; I updated it to 1.9.0 as well to verify that the problem is not something that broke between chart or application versions.
Please let me know if you need any additional information. Thanks in advance.