I have two machines I’d like to connect as a Nomad cluster.
- Both machines are connected to a Tailscale VPN. I would like to use the Tailscale interface for all interactions between the nodes, because I run services on them that I want to be accessible remotely via the VPN.
- Both machines are running Consul and have discovered each other. Machine A is configured with server=true and bootstrap=true; Machine B is not (rough sketch after this list). As far as I can tell, Consul is working just fine.
- Both nodes are running Nomad. Machine A is configured as both a server and a client, and Machine B is configured only as a client.
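For reference, the Consul side looks roughly like this on Machine A (a trimmed sketch; the exact NixOS option names are from memory, Machine B just drops the server/bootstrap flags, and datacenter/join/data_dir settings are omitted):

    services.consul = {
      enable = true;
      extraConfig = {
        server = true;
        bootstrap = true;
        # Same GetInterfaceIP pattern as the Nomad config further down,
        # so Consul binds and advertises on the Tailscale interface.
        bind_addr = "{{ GetInterfaceIP \"tailscale0\" }}";
        advertise_addr = "{{ GetInterfaceIP \"tailscale0\" }}";
      };
    };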
From what I’ve read, this should be enough for the two Nomad instances to discover each other via Consul and form a cluster with one server and two clients (Machine A acting in both roles).
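My understanding is that this discovery happens through Nomad's consul stanza; its defaults, as far as I can tell, amount to the following (none of these appear in the settings I show further down, so the defaults should apply):

    consul = {
      address = "127.0.0.1:8500";   # talk to the local Consul agent
      auto_advertise = true;        # register the Nomad agent in Consul
      server_auto_join = true;      # servers find each other via Consul
      client_auto_join = true;      # clients find servers via Consul
    };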
However, when Machine B’s Nomad instance tries to connect to Machine A, I see:
client.server_mgr: no servers available
client: registration waiting on servers
client.consul: bootstrap contacting Consul DCs: consul_dcs=["dc1"]
client: error discovering nomad servers:
error=
| 1 error occurred:
| * address 192.168.1.129: missing port in address
The interesting thing here is that 192.168.1.129 is Machine A's correct LAN address, not its address on the Tailscale VPN. So the right machine is being discovered via Consul, but the wrong address is being used to connect to it.
In the Consul UI, all addresses shown are in the Tailscale VPN IP range.
Does anyone know why Nomad might use a different address for a machine than the one advertised by Consul?
To be clear, the Nomad agent on both nodes is configured via NixOS with:
bind_addr = "{{ GetInterfaceIP \"tailscale0\" }}";
advertise = {
http = "{{ GetInterfaceIP \"tailscale0\" }}";
rpc = "{{ GetInterfaceIP \"tailscale0\" }}";
serf = "{{ GetInterfaceIP \"tailscale0\" }}";
};
addresses = {
http = "{{ GetInterfaceIP \"tailscale0\" }}";
rpc = "{{ GetInterfaceIP \"tailscale0\" }}";
serf = "{{ GetInterfaceIP \"tailscale0\" }}";
}
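That block sits under the NixOS Nomad module's settings. On Machine A the surrounding configuration is roughly as follows (a sketch; option names and the bootstrap_expect value are from memory, and Machine B is the same minus the server block):

    services.nomad = {
      enable = true;
      settings = {
        # bind_addr, advertise and addresses exactly as shown above, plus:
        server = {
          enabled = true;         # Machine A only
          bootstrap_expect = 1;   # single-server cluster
        };
        client = {
          enabled = true;         # enabled on both machines
        };
      };
    };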