Meaning of "Failed to resolve consul-consul-server-0.consul-consul-server.consul.svc"

Hello,

I would need your assistance in figuring out what this error message means “Failed to resolve consul-consul-server-0.consul-consul-server.consul.svc”.
Here is snippet from the error in the logs:
2020-12-18T09:39:53.891Z [WARN] agent: Join cluster failed, will retry: cluster=LAN retry_interval=30s error=
2020-12-18T09:40:23.891Z [INFO] agent: (LAN) joining: lan_addresses=[consul-consul-server-0.consul-consul-server.consul.svc, consul-consul-server-1.consul-consul-server.consul.svc, consul-consul-server-2.consul-consul-server.consul.svc]
2020-12-18T09:40:24.083Z [WARN] agent.server.memberlist.lan: memberlist: Failed to resolve consul-consul-server-0.consul-consul-server.consul.svc: lookup consul-consul-server-0.consul-consul-server.consul.svc on 169.53.123.12:53: no such host
2020-12-18T09:40:24.310Z [WARN] agent.server.memberlist.lan: memberlist: Failed to resolve consul-consul-server-1.consul-consul-server.consul.svc: lookup consul-consul-server-1.consul-consul-server.consul.svc on 169.53.123.12:53: no such host
2020-12-18T09:40:24.512Z [WARN] agent.server.memberlist.lan: memberlist: Failed to resolve consul-consul-server-2.consul-consul-server.consul.svc: lookup consul-consul-server-2.consul-consul-server.consul.svc on 169.53.123.12:53: no such host
2020-12-18T09:40:24.512Z [WARN] agent: (LAN) couldn’t join: number_of_nodes=0 error="3 errors occurred:
* Failed to resolve consul-consul-server-0.consul-consul-server.consul.svc: lookup consul-consul-server-0.consul-consul-server.consul.svc on 169.53.123.12:53: no such host
* Failed to resolve consul-consul-server-1.consul-consul-server.consul.svc: lookup consul-consul-server-1.consul-consul-server.consul.svc on 169.53.123.12:53: no such host
* Failed to resolve consul-consul-server-2.consul-consul-server.consul.svc: lookup consul-consul-server-2.consul-consul-server.consul.svc on 169.53.123.12:53: no such host

We have 3 Consul servers running in Azure AKS and 6 consul clients running on VMs.
In the Consul Dashboard, i can see all of them(the 3 servers running in AKS and the 6 clients running on the VMs).

I must point out that everything works fine, therefore i was wondering whether this is a serious error message. If it is, could you please provide some instruction on how it can be fixed?
If you need any additional information, please let me know.

Thanks in advance for your time,
BR
Mike

Hi @mike-miller-ct,

This error is indicating that the pod is unable to resolve the DNS hostname consul-consul-server-0.consul-consul-server.consul.svc from the configured DNS resolver.

Are these errors being generated from the clients? If so, are the clients properly configured to be able to resolve those hostnames from the DNS provider in Kubernetes?

Hi blake,

the error is happening in the servers runninf in AKS. The VMs on which I have Consul running, till now, they havent caused any issues. When I run the command consul members, irrespectively if i do it by connecting to the containers in AKS or to the Consul clients running on the VMs, all the servers and clients are being properly shown.

Hi @mike-miller-ct if you do see this happening often, we would need to get your values.yml config to understand how this might be happening. Thank you.

Nvm I see you got the issue resolved here after Blake had mentioned he worked with you: Consul Server unable to join Consul Cluster - #3 by mike-miller-ct

It is quite strange that DNS does not seem to be working on AKS nodes, I have seen this happen once or twice before when using Azure CNI.