Consul errors when `advertise_addr_wan` is set

jmwilkinson · November 17, 2021, 10:38pm

I’m struggling to understand the behavior of Consul in k8s when the `advertise_addr_wan` config parameter is set.

I need to communicate with the Consul cluster in k8s from other clusters located elsewhere, so I assumed the cluster would need to advertise over WAN using a VIP. I further assumed that setting that config parameter would only apply to servers joining from other clusters over WAN, and not to the servers deployed by Helm within the k8s cluster. The docs appear to confirm this:

The advertise WAN address is used to change the address that we advertise to server nodes joining through the WAN

However, as soon as I set that within the Helm chart, the Consul nodes all appear to attempt to cluster using the WAN address. This may fail, depending on the status of the VIP and the various port forwards, which is obviously undesirable, because a LAN cluster within k8s should stay clustered as long as networking between the k8s nodes is functioning.

As soon as I remove that config option, the nodes cluster as I’d expect and everything is happy:

2021-11-17T22:15:02.768Z [INFO] agent.leader: started routine: routine="CA root pruning"
2021-11-17T22:15:02.772Z [INFO] agent.server: member joined, marking health alive: member=consul-consul-server-2
2021-11-17T22:15:02.779Z [INFO] agent.server: member joined, marking health alive: member=consul-consul-server-0
2021-11-17T22:15:02.847Z [INFO] agent.server: member joined, marking health alive: member=consul-consul-server-1
2021-11-17T22:15:02.854Z [INFO] agent.server: federation state anti-entropy synced
2021-11-17T22:15:03.035Z [INFO] agent: Synced node info

But add it back in, and the nodes appear to be joining over WAN and yelling about weird pings…

2021-11-17T22:29:07.181Z [INFO] agent.server.serf.wan: serf: EventMemberJoin: consul-consul-server-1.control 10.238.1.20
2021-11-17T22:29:07.181Z [INFO] agent.server.serf.wan: serf: EventMemberJoin: consul-consul-server-0.control 10.238.1.20

2021-11-17T22:29:14.117Z [INFO] agent: Synced node info
2021-11-17T22:29:15.071Z [WARN] agent.server.memberlist.wan: memberlist: Got ping for unexpected node 'consul-consul-server-0.control' from=10.42.7.0:46994
2021-11-17T22:29:16.969Z [WARN] agent.server.memberlist.wan: memberlist: Got ping for unexpected node 'consul-consul-server-0.control' from=10.42.0.0:59135
2021-11-17T22:29:16.969Z [WARN] agent.server.memberlist.wan: memberlist: Refuting a suspect message (from: consul-consul-server-1.control)
2021-11-17T22:29:18.071Z [WARN] agent.server.memberlist.wan: memberlist: Got ping for unexpected node 'consul-consul-server-0.control' from=10.42.7.0:46994
2021-11-17T22:29:18.072Z [WARN] agent.server.memberlist.wan: memberlist: Got ping for unexpected node consul-consul-server-0.control from=10.42.2.0:15825

What’s up with this? Why is it experiencing WAN problems when the cluster nodes are all on the LAN? What am I missing?

lkysow · November 17, 2021, 11:34pm

Sorry for the confusion. The servers join each other over both WAN and LAN, so the WAN addresses still need to be routable even over the LAN.

I agree this seems weird, I can try and find out more info.
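One detail visible in the second log excerpt above: both `consul-consul-server-0.control` and `consul-consul-server-1.control` join the WAN pool with the same address, `10.238.1.20`. If every server advertises the same VIP, that may be related to the ping warnings. The following is a minimal sketch, assuming log lines in the format shown above, that lists WAN addresses advertised by more than one member. The function names and the regular expression are illustrative and not part of Consul.

```python
import re
from collections import defaultdict

# Matches serf WAN join events in the format shown in the logs above.
JOIN_RE = re.compile(
    r"agent\.server\.serf\.wan: serf: EventMemberJoin: (?P<member>\S+) (?P<addr>\S+)"
)


def wan_members_by_address(log_lines):
    """Group WAN pool members by the address they joined with."""
    members = defaultdict(set)
    for line in log_lines:
        match = JOIN_RE.search(line)
        if match:
            members[match.group("addr")].add(match.group("member"))
    return members


def shared_wan_addresses(log_lines):
    """Return only the addresses advertised by more than one member."""
    return {
        addr: sorted(names)
        for addr, names in wan_members_by_address(log_lines).items()
        if len(names) > 1
    }


# The two join events quoted earlier in this thread.
SAMPLE_LOG = [
    "2021-11-17T22:29:07.181Z [INFO] agent.server.serf.wan: serf: "
    "EventMemberJoin: consul-consul-server-1.control 10.238.1.20",
    "2021-11-17T22:29:07.181Z [INFO] agent.server.serf.wan: serf: "
    "EventMemberJoin: consul-consul-server-0.control 10.238.1.20",
]

if __name__ == "__main__":
    for addr, names in shared_wan_addresses(SAMPLE_LOG).items():
        print(f"{addr} is advertised by: {', '.join(names)}")
```

Running this against a full server log would show whether all servers in the datacenter share one WAN address or whether each has its own.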

jmwilkinson · February 7, 2022, 8:54pm

Just wanted to mention this is still the behavior with consul 1.11.2.

I have spent a small amount of time looking into where these messages come from. I believe the code that handles this is here, or perhaps here. But it’s not obvious to me from either of those how the WAN pool gets configured or why it is populated with LAN nodes.
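For what it's worth, here is a toy model of one way the "Got ping for unexpected node" warning could arise. It rests on two assumptions that are not confirmed in this thread: that a memberlist ping carries the name of the node it is meant for and the receiver drops it with a warning when the name is not its own, and that the shared VIP forwards packets to the servers in turn. The `Server` and `SharedVip` classes are hypothetical and only illustrate the idea.

```python
from itertools import cycle


class Server:
    """Toy stand-in for one Consul server in the WAN pool."""

    def __init__(self, name):
        self.name = name
        self.warnings = []

    def handle_ping(self, target_name, source):
        """Answer the ping only if it names this node; otherwise record a warning."""
        if target_name and target_name != self.name:
            self.warnings.append(
                f"Got ping for unexpected node '{target_name}' from={source}"
            )
            return False
        return True


class SharedVip:
    """Hypothetical VIP that forwards each packet to the next backend in turn."""

    def __init__(self, backends):
        self._next_backend = cycle(backends)

    def deliver_ping(self, target_name, source):
        return next(self._next_backend).handle_ping(target_name, source)


servers = [Server(f"consul-consul-server-{i}.control") for i in range(3)]
vip = SharedVip(servers)

# Three pings, all meant for server-0, all sent to the one shared WAN address.
results = [
    vip.deliver_ping("consul-consul-server-0.control", "10.42.7.0:46994")
    for _ in range(3)
]

if __name__ == "__main__":
    print(results)
    for server in servers:
        for warning in server.warnings:
            print(f"{server.name}: {warning}")
```

In this model only the ping that happens to land on server-0 is answered; the other two servers each record a warning, and from the sender's point of view server-0 looks intermittently unreachable.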
