Internal routing problem

davosian1 · January 13, 2022, 7:10am

When I deploy two services behind a fabio load balancer, I can only reach the one that is deployed on the same node as the IP address it is running on. The moment fabio routes to the other node, I get a gateway timeout. I tested this using the load balancing tutorial at Load Balancing with Fabio | Nomad - HashiCorp Learn.

It looks like something is missing in my configuration since I am getting the same behavior when replacing fabio with traefik like in this tutorial.

My setup is using nomad with consul in HA with two nomad client nodes.

Any idea what I am missing in my setup?

tgross · January 13, 2022, 1:53pm

I would check the IP:port registered for all the instances in Consul, and make sure that those IPs are routable between nodes and also reachable. Ex. if the host has both a public and private IP, did the public IP get registered but there’s a firewall/security group rule that blocks it?

davosian1 · January 13, 2022, 8:57pm

Hi @tgross, you pushed me in the right direction, thank you! Checking Consul, I found that my service instances have registered with the public IPs which are blocked. However, this triggers my next question: how can I tell Consul to pick the internal network address instead (I do not want to use the external IPs)? Is there a config setting for it?

The private network I have set up is 10.0.0.0/16 and my consul as well as my nomad nodes are registering correctly in this network (set up with --retry-join 10.0.0.x).

davosian1 · January 14, 2022, 10:06pm

I did some experimentation with binding addresses (bind_addr) in Consul and Nomad, however, I still have the behavior of my services being registered with public IPs which is probably why I cannot reach them from inside the cluster:

How can I tell nomad or consul to map it to the private IP in the 10.0.0.0/16 network instead? The nodes are using these internal IPs just fine:

Consul servers are configured like this:

datacenter  = "dc1"
data_dir    = "/opt/consul"
bind_addr   = "{{ GetPrivateInterfaces | include \"network\" \"10.0.0.0/16\" | attr \"address\" }}"
client_addr = "0.0.0.0"
retry_join  = ["10.0.0.2", "10.0.0.3", "10.0.0.4"]
ports {
}
addresses {
}
ui               = true
server           = true
bootstrap_expect = 3

This is what I am using for consul clients:

datacenter = "dc1"
data_dir   = "/opt/consul"
bind_addr  = "{{ GetPrivateInterfaces | include \"network\" \"10.0.0.0/16\" | attr \"address\" }}"
retry_join = ["10.0.0.2", "10.0.0.3", "10.0.0.4"]
ports {
}
addresses {
}

Nomad servers have this configuration:

datacenter = "dc1"
data_dir   = "/opt/nomad"
bind_addr   = "{{ GetPrivateInterfaces | include \"network\" \"10.0.0.0/16\" | attr \"address\" }}"
addresses {
  http = "0.0.0.0"
}
server {
  enabled          = true
  bootstrap_expect = 3
}

Nomad clients are configured like this:

datacenter = "dc1"
data_dir   = "/opt/nomad"
bind_addr   = "{{ GetPrivateInterfaces | include \"network\" \"10.0.0.0/16\" | attr \"address\" }}"
addresses {
  http = "0.0.0.0"
}
client {
  enabled = true
}

The servers and clients have these network interfaces configured (slightly obscured with xx):

# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 96:00:01:09:xx:xx brd ff:ff:ff:ff:ff:ff
    inet 49.12.xxx.xxx/32 brd 49.12.xxx.xxx scope global dynamic eth0
       valid_lft 84587sec preferred_lft 84587sec
3: ens10: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 86:00:00:01:9a:3f brd ff:ff:ff:ff:ff:ff
    inet 10.0.0.2/32 brd 10.0.0.2 scope global dynamic ens10
       valid_lft 84592sec preferred_lft 84592sec
    inet6 fe80::8400:ff:fe01:9a3f/64 scope link
       valid_lft forever preferred_lft forever
4: zt2lrqabin: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 2800 qdisc pfifo_fast state UNKNOWN group default qlen 1000
    link/ether de:10:d0:0f:xx:xx brd ff:ff:ff:ff:ff:ff
    inet 172.26.xxx.xxx/16 brd 172.26.255.255 scope global zt2lrqabin
       valid_lft forever preferred_lft forever

The last one, zt2lrqabin is a ZeroTier VPN network that I am using to interact with the servers remotely which in my opinion should not matter in this context.

The nodes are connected through these routes (no route to the external 49.12.xxx.xxx network, which is what I want for security):

# ip r
default via 172.31.1.1 dev eth0
10.0.0.0/16 via 10.0.0.1 dev ens10
10.0.0.1 dev ens10 scope link
172.26.0.0/16 dev zt2lrqabin proto kernel scope link src 172.26.xxx.xxx
172.31.1.1 dev eth0 scope link

What am I missing?

DerekStrickland · January 18, 2022, 11:12am

Hi @davosian1

Thanks again for taking on this project and being so diligent with it.

I suspect what is happening is that your default network is getting selected, and in your case that just happens to be the one you aren’t interested in. Here are a couple of settings I think you could use to force the network selection you want. I’d start with network_interface. I found this relevant doc snippet in the upgrade guide that I think explains why.
“If you have several routable IPs, it is advised to configure the client’s network interface such that tasks bind to the correct address.”.

network interface - This specifies the name of the interface to force network fingerprinting on for the whole client. When run in dev mode, this defaults to the loopback interface. When not in dev mode, the interface attached to the default route is used. The scheduler chooses from these fingerprinted IP addresses when allocating ports for tasks. This value support go-sockaddr/template format.

So in your case it would be something like

client {
  network_interface = "ens10"
}

host_network - This is used to register additional host networks with the node that can be used when port mapping. Note that there is a corresponding field on the jobspec that I think will also have to be set to hook into this network. If you need/want to bind to multiple host networks, I think this is the way you can achieve that.

Did this get you where you needed to go?

davosian1 · January 18, 2022, 9:24pm

Hi @DerekStrickland, you nailed it! Using network_interface on the client sets up the services with the internal IPs and therefore are reachable within the cluster. This is how my config file for a nomad client looks like now:

datacenter = "dc1"
data_dir   = "/opt/nomad"
advertise {
  http = "{{ GetPrivateInterfaces | include \"network\" \"10.0.0.0/16\" | attr \"address\" }}"
  rpc  = "{{ GetPrivateInterfaces | include \"network\" \"10.0.0.0/16\" | attr \"address\" }}"
  serf = "{{ GetPrivateInterfaces | include \"network\" \"10.0.0.0/16\" | attr \"address\" }}"
}
client {
  enabled = true
  network_interface = "ens10"
}

Thanks a lot for moving my project forward!

DerekStrickland · January 18, 2022, 10:01pm

Awesome!!! I love it when things work

Topic		Replies	Views
Fabio creates routes with container internal IPs for Nomad Job Nomad	3	1087	July 14, 2021
Load balancing in Nomad Nomad	4	763	January 18, 2022
Ingress Type Load Balancing? Nomad	3	608	August 13, 2019
Proxy sidecar unexpectedly advertises localhost via dns Consul connect , nomad	7	337	February 1, 2024
How to load balance with custom domain in Nomad Nomad	1	48	August 19, 2024

Internal routing problem

Related topics