I am trying to set up a basic Nomad/Consul integration. My limited understanding of what Consul provides in that scenario is that Nomad in production should run the workloads on localhost, and the proxy sidecars then handle the communication over the public network. My expectation is that service A on host A talks to its local proxy, which asks Consul which machine service B runs on, gets back the public IP for host B, and connects to the proxy on host B, where the traffic is routed through Envoy to service B on that machine. Is that right?
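For concreteness, the pattern I'm describing is roughly the standard Nomad Connect setup; a minimal sketch of the frontend side of my job spec looks something like this (names, ports, and the image are placeholders):

```hcl
job "frontend" {
  datacenters = ["dc1"]

  group "frontend" {
    # Connect sidecars require bridge networking for the group.
    network {
      mode = "bridge"
    }

    service {
      name = "frontend"
      port = "8080"

      connect {
        sidecar_service {
          proxy {
            upstreams {
              # The app talks to "backend" by connecting to its local
              # Envoy sidecar on 127.0.0.1:9090; Envoy forwards the
              # traffic to the backend's sidecar on the other host.
              destination_name = "backend"
              local_bind_port  = 9090
            }
          }
        }
      }
    }

    task "app" {
      driver = "docker"
      config {
        image = "example/frontend:latest"
      }
    }
  }
}
```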
This internal model seems to correlate with what I am seeing when running Nomad on my public network interface: my services can connect to one another, and I can control whether they are allowed to communicate with a Consul intention, which seems to rule out them talking directly to one another outside the mesh.
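The intention I'm toggling is just a simple allow/deny between the two services, something like this (applied with `consul config write`, service names again being placeholders):

```hcl
Kind = "service-intentions"
Name = "backend"

Sources = [
  {
    # Flipping this between "allow" and "deny" is what controls
    # whether frontend can reach backend through the mesh.
    Name   = "frontend"
    Action = "allow"
  }
]
```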
When I change the Nomad network interface to lo, though, my frontend service can no longer talk to my backend; it gets an Envoy 111 error (connection refused), which after some googling seems to indicate there is nothing listening on the ip:port being reached. When I run some basic DNS queries against Consul, I get back the bind address inside of Nomad, which in this case is 127.0.0.1. My speculation is that service A connects to its proxy, the proxy looks up where to send the traffic, finds 127.0.0.1 as the advertised address, connects to 127.0.0.1 from inside the sidecar (where no service is running), and gets the 111 error.
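For reference, the only change on the Nomad client is the interface setting, roughly:

```hcl
client {
  enabled = true

  # Switching this from my public interface to loopback is what
  # breaks the frontend -> backend connection described above.
  network_interface = "lo"
}
```

The DNS query I was using was something along the lines of `dig @127.0.0.1 -p 8600 backend.service.consul`, which comes back with 127.0.0.1.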
This leads to a couple of questions. First, mostly for my own edification: is this understanding of the behavior of Consul service mesh correct? Second, and more importantly: should the Consul query for my backend service return the public IP rather than 127.0.0.1? If so, what do I need to do so that it advertises correctly even when the workload itself is bound to localhost? And if my guess is wrong and it should advertise localhost, what lookup should I use to figure out which machine is actually running it?