We’ve got a brand new bare metal Kubernetes 1.19.4 cluster that uses Cilium 1.9.0 as a CNI and MetalLB for load balancing. I’m attempting to mirror the services we’re running on another cluster to this one, starting with Consul. However, I’m running into a strange issue where the readiness check for the consul-server pods is failing if we use https://127.0.0.1:8501 as the endpoint, but not if we use any of the other SANs, e.g. https://consul-server.consul:8501.
I’ve verified that `tls.crt` in the `consul-server-cert` secret contains the appropriate SANs:

```
DNS:consul-server, DNS:*.consul-server, DNS:*.consul-server.consul, DNS:*.consul-server.consul.svc, DNS:*.server.dc1.consul, DNS:consul-server.consul, DNS:server.dc1.consul, DNS:localhost, IP Address:127.0.0.1
```
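For anyone who wants to reproduce the verification step: the SANs can be dumped straight from the secret with `kubectl` piped through `openssl`. Below is a self-contained sketch that instead generates a throwaway certificate with a loopback SAN and inspects it; all names and paths in it are illustrative, not taken from the real cluster.

```shell
# Against the real cluster this would be something like (assuming the secret
# lives in the consul namespace):
#   kubectl -n consul get secret consul-server-cert -o jsonpath='{.data.tls\.crt}' \
#     | base64 -d | openssl x509 -noout -ext subjectAltName
#
# Self-contained illustration: generate a throwaway cert with a loopback SAN
# and confirm the IP entry is present (requires OpenSSL 1.1.1+ for -addext).
openssl req -x509 -newkey rsa:2048 -nodes -keyout /dev/null -out tls.crt \
  -subj "/CN=server.dc1.consul" -days 1 \
  -addext "subjectAltName=DNS:localhost,IP:127.0.0.1"

# Must list "IP Address:127.0.0.1" for curl to accept https://127.0.0.1
openssl x509 -in tls.crt -noout -ext subjectAltName
```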
But running the readiness check command ad hoc shows that curl cannot verify the server’s certificate when connecting via the loopback address:
```
/consul/tls/ca # curl --cacert /consul/tls/ca/tls.crt https://127.0.0.1:8501/v1/status/leader
curl: (60) SSL certificate problem: unable to get local issuer certificate
More details here: https://curl.haxx.se/docs/sslcerts.html

curl failed to verify the legitimacy of the server and therefore could not
establish a secure connection to it. To learn more about this situation and
how to fix it, please visit the web page mentioned above.
```
However, the same check succeeds against any of the DNS SANs, e.g.:

```
/consul/tls/ca # curl --cacert /consul/tls/ca/tls.crt https://consul-server.consul:8501/v1/status/leader
"10.0.2.183:8300"
```
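One detail worth noting from the failing output: curl reports a broken chain (“unable to get local issuer certificate”), not a SAN/hostname mismatch, which would suggest the listener reached via 127.0.0.1 is presenting a certificate that the file passed to `--cacert` did not issue. The two failure modes can be distinguished offline with `openssl verify`; here is a minimal, self-contained sketch using throwaway CAs (all names hypothetical, not from the cluster):

```shell
# Two unrelated throwaway CAs
openssl req -x509 -newkey rsa:2048 -nodes -keyout ca1.key -out ca1.crt \
  -subj "/CN=ca-one" -days 1
openssl req -x509 -newkey rsa:2048 -nodes -keyout ca2.key -out ca2.crt \
  -subj "/CN=ca-two" -days 1

# One CSR, signed once by each CA
openssl req -newkey rsa:2048 -nodes -keyout leaf.key -out leaf.csr \
  -subj "/CN=server.dc1.consul"
openssl x509 -req -in leaf.csr -CA ca1.crt -CAkey ca1.key -CAcreateserial \
  -out leaf1.crt -days 1
openssl x509 -req -in leaf.csr -CA ca2.crt -CAkey ca2.key -CAcreateserial \
  -out leaf2.crt -days 1

# Cert issued by the CA we trust: verification succeeds
openssl verify -CAfile ca1.crt leaf1.crt   # prints: leaf1.crt: OK

# Cert issued by a different CA: same error curl reported above
openssl verify -CAfile ca1.crt leaf2.crt   # fails: unable to get local issuer certificate
```

If the cert Consul actually serves on 8501 fails this check against the CA file used by the probe, the problem is the served chain rather than the SAN list.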
I’ve completely wiped out the deployment, along with all associated secrets and persistent data, and started from scratch, with the same results. The same setup works as expected on our existing cluster; I upgraded it to 1.9.0 as well to verify that nothing broke between chart or application versions.
Please let me know if you need any additional information. Thanks in advance.