Nomad with Keepalived-controlled interfaces

Hi,

I've got Nomad running at home on three Raspberry Pis. Each one acts as both a Nomad server and a Nomad client.

To reach selected services from the internet, I forward ports from my router to the Nomad cluster. My router is fairly basic and cannot forward to a DNS name, so for the time being I created one fixed IP (.210) as the target for incoming traffic. The nodes themselves have their own private IPs as well (.201, .202, .203). I can move the cluster IP (.210) between the nodes with keepalived.

The node that holds the cluster IP runs Traefik as the ingress controller.

My problem is the network_interface setting on the client. I tried host_network as well, but Nomad always binds ports to the node's physical IP, and I have not found a way to tell the Nomad client to bind to 0.0.0.0.

As a result, Traefik, for example, does not receive the HTTP/HTTPS traffic, and my cluster IP is useless :see_no_evil:

Example config for the client:

client {
    enabled    = true
    node_class = "local"
    meta {
        (...)
    }
    options = {
        (...)
    }
    # network_interface = "eth0"

    host_network "zerozero" {
        cidr = "0.0.0.0/0"
    }
}

root@homie:~# ifconfig eth0:0
eth0:0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.178.210  netmask 255.255.255.255  broadcast 0.0.0.0
        ether b8:27:eb:2e:b6:49  txqueuelen 1000  (Ethernet)

root@homie:~# docker ps
CONTAINER ID        IMAGE                             COMMAND                  CREATED             STATUS              PORTS                                                                                                                                                                                  NAMES
8a09330fbc19        traefik:v2.2.11                   "/entrypoint.sh trae…"   2 weeks ago         Up 2 weeks          192.168.178.203:80->80/tcp, 192.168.178.203:80->80/udp, 192.168.178.203:443->443/tcp, 192.168.178.203:443->443/udp, 192.168.178.203:28303->8080/tcp, 192.168.178.203:28303->8080/udp   traefik-414f19b1-bd6e-cf44-26fd-3444e167b6b5

It might be a stupid question, but how do I tell Nomad to start containers that bind their ports to 0.0.0.0?
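
One direction that might be worth a try, as an untested sketch rather than a confirmed fix: the Docker driver has a network_mode option, and with network_mode = "host" the container shares the host's network namespace, so Docker does no per-IP port publishing at all. The job name, group name and datacenter below are placeholders, and the sketch assumes Traefik's entry points are configured to listen on :80 and :443.

```hcl
# Sketch only: job/group names and the datacenter are assumptions.
job "traefik" {
  datacenters = ["dc1"] # assumption: adjust to your datacenter name

  group "ingress" {
    # Reserve the ports so the scheduler knows they are in use on the node.
    network {
      port "http" {
        static = 80
      }
      port "https" {
        static = 443
      }
    }

    task "traefik" {
      driver = "docker"

      config {
        image        = "traefik:v2.2.11"
        network_mode = "host" # container uses the host's network stack directly
      }
    }
  }
}
```

In host mode a process listening on :80 accepts traffic on every address the node currently holds, including an address keepalived adds later. The trade-off is that the container loses network isolation from the host.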

Any hint on how to get this running would be appreciated :smile:

Bye

I did some additional experiments:

  • rolled out the 0.0.0.0/0 host_network on all Nomad agents:
    host_network "zerozero" {
      cidr = "0.0.0.0/0"
    }
  • forced an example task to use this host network:
        resources {
          cpu    = 128
          memory = 64
          network {
            mbits = 10
            port "http" {
              static = 1234
              host_network = "zerozero"
            }
          }
        }
  • result :cry:
root@node-1:~ # docker ps
CONTAINER ID        IMAGE                COMMAND                  CREATED             STATUS              PORTS                                                                    NAMES
4aae3ad14267        nginx:latest         "/docker-entrypoint.…"   28 minutes ago      Up 28 minutes       80/tcp, 192.168.178.201:1234->1234/tcp, 192.168.178.201:1234->1234/udp   nginx-35d49b36-c6b5-d624-2bfd-bf7d47ac32ab
  • ran the Nomad node in debug mode to gather some evidence:
Oct 16 19:21:09 node-1 nomad[14315]:     2020-10-16T19:21:09.932+0200 [DEBUG] client.alloc_migrator: waiting for remote previous alloc to terminate: alloc_id=35d49b36-c6b5-d624-2bfd-bf7d47ac32ab previous_alloc=a00a00b7-2abf-be6a-107b-1f0dc6310fbb
Oct 16 19:21:09 node-1 nomad[14315]:     2020-10-16T19:21:09.975+0200 [DEBUG] client.alloc_runner.task_runner.task_hook.logmon: starting plugin: alloc_id=35d49b36-c6b5-d624-2bfd-bf7d47ac32ab task=nginx path=/usr/bin/nomad args=[/usr/bin/nomad, logmon]
Oct 16 19:21:09 node-1 nomad[14315]:     2020-10-16T19:21:09.987+0200 [DEBUG] client.alloc_runner.task_runner.task_hook.logmon: plugin started: alloc_id=35d49b36-c6b5-d624-2bfd-bf7d47ac32ab task=nginx path=/usr/bin/nomad pid=14537
Oct 16 19:21:09 node-1 nomad[14315]:     2020-10-16T19:21:09.987+0200 [DEBUG] client.alloc_runner.task_runner.task_hook.logmon: waiting for RPC address: alloc_id=35d49b36-c6b5-d624-2bfd-bf7d47ac32ab task=nginx path=/usr/bin/nomad
Oct 16 19:21:10 node-1 nomad[14315]:     2020-10-16T19:21:10.035+0200 [DEBUG] client.alloc_runner.task_runner.task_hook.logmon.nomad: plugin address: alloc_id=35d49b36-c6b5-d624-2bfd-bf7d47ac32ab task=nginx address=/tmp/plugin345046492 network=unix @module=logmon timestam
Oct 16 19:21:10 node-1 nomad[14315]:     2020-10-16T19:21:10.036+0200 [DEBUG] client.alloc_runner.task_runner.task_hook.logmon: using plugin: alloc_id=35d49b36-c6b5-d624-2bfd-bf7d47ac32ab task=nginx version=2
Oct 16 19:21:10 node-1 nomad[14315]:     2020-10-16T19:21:10.052+0200 [INFO]  client.alloc_runner.task_runner.task_hook.logmon.nomad: opening fifo: alloc_id=35d49b36-c6b5-d624-2bfd-bf7d47ac32ab task=nginx @module=logmon path=/var/nomad/node/alloc/35d49b36-c6b5-d624-2bfd-b
Oct 16 19:21:10 node-1 nomad[14315]:     2020-10-16T19:21:10.054+0200 [INFO]  client.alloc_runner.task_runner.task_hook.logmon.nomad: opening fifo: alloc_id=35d49b36-c6b5-d624-2bfd-bf7d47ac32ab task=nginx @module=logmon path=/var/nomad/node/alloc/35d49b36-c6b5-d624-2bfd-b
Oct 16 19:21:10 node-1 nomad[14315]:     2020-10-16T19:21:10.177+0200 [DEBUG] client: updated allocations: index=420178 total=2 pulled=0 filtered=2
Oct 16 19:21:10 node-1 nomad[14315]:     2020-10-16T19:21:10.178+0200 [DEBUG] client: allocation updates: added=0 removed=0 updated=0 ignored=2
Oct 16 19:21:10 node-1 nomad[14315]:     2020-10-16T19:21:10.178+0200 [DEBUG] client: allocation updates applied: added=0 removed=0 updated=0 ignored=2 errors=0
Oct 16 19:21:10 node-1 nomad[14315]:     2020-10-16T19:21:10.387+0200 [DEBUG] client: updated allocations: index=420179 total=2 pulled=0 filtered=2
Oct 16 19:21:10 node-1 nomad[14315]:     2020-10-16T19:21:10.387+0200 [DEBUG] client: allocation updates: added=0 removed=0 updated=0 ignored=2
Oct 16 19:21:10 node-1 nomad[14315]:     2020-10-16T19:21:10.387+0200 [DEBUG] client: allocation updates applied: added=0 removed=0 updated=0 ignored=2 errors=0
Oct 16 19:21:15 node-1 nomad[14315]:     2020-10-16T19:21:15.599+0200 [DEBUG] http: request complete: method=GET path=/v1/agent/health?type=client duration=12.238605ms
Oct 16 19:21:20 node-1 nomad[14315]:     2020-10-16T19:21:20.115+0200 [DEBUG] client.driver_mgr.docker: image pull progress: driver=docker image_name=nginx:latest message="Pulled 3/5 (37.73MiB/45.05MiB) layers: 0 waiting/2 pulling - est 1.9s remaining"
Oct 16 19:21:25 node-1 nomad[14315]:     2020-10-16T19:21:25.603+0200 [DEBUG] http: request complete: method=GET path=/v1/agent/health?type=client duration=766.456µs
Oct 16 19:21:30 node-1 nomad[14315]:     2020-10-16T19:21:30.115+0200 [DEBUG] client.driver_mgr.docker: image pull progress: driver=docker image_name=nginx:latest message="Pulled 5/5 (45.05MiB/45.05MiB) layers: 0 waiting/0 pulling"
Oct 16 19:21:35 node-1 nomad[14315]:     2020-10-16T19:21:35.610+0200 [DEBUG] http: request complete: method=GET path=/v1/agent/health?type=client duration=1.074788ms
Oct 16 19:21:40 node-1 nomad[14315]:     2020-10-16T19:21:40.115+0200 [DEBUG] client.driver_mgr.docker: image pull progress: driver=docker image_name=nginx:latest message="Pulled 5/5 (45.05MiB/45.05MiB) layers: 0 waiting/0 pulling"
Oct 16 19:21:45 node-1 nomad[14315]:     2020-10-16T19:21:45.626+0200 [DEBUG] http: request complete: method=GET path=/v1/agent/health?type=client duration=10.340486ms
Oct 16 19:21:50 node-1 nomad[14315]:     2020-10-16T19:21:50.115+0200 [DEBUG] client.driver_mgr.docker: image pull progress: driver=docker image_name=nginx:latest message="Pulled 5/5 (45.05MiB/45.05MiB) layers: 0 waiting/0 pulling"
Oct 16 19:21:55 node-1 nomad[14315]:     2020-10-16T19:21:55.633+0200 [DEBUG] http: request complete: method=GET path=/v1/agent/health?type=client duration=2.670095ms
Oct 16 19:22:00 node-1 nomad[14315]:     2020-10-16T19:22:00.115+0200 [DEBUG] client.driver_mgr.docker: image pull progress: driver=docker image_name=nginx:latest message="Pulled 5/5 (45.05MiB/45.05MiB) layers: 0 waiting/0 pulling"
Oct 16 19:22:05 node-1 nomad[14315]:     2020-10-16T19:22:05.641+0200 [DEBUG] http: request complete: method=GET path=/v1/agent/health?type=client duration=846.664µs
Oct 16 19:22:10 node-1 nomad[14315]:     2020-10-16T19:22:10.115+0200 [DEBUG] client.driver_mgr.docker: image pull progress: driver=docker image_name=nginx:latest message="Pulled 5/5 (45.05MiB/45.05MiB) layers: 0 waiting/0 pulling"
Oct 16 19:22:10 node-1 nomad[14315]:     2020-10-16T19:22:10.174+0200 [DEBUG] client: updated allocations: index=420181 total=2 pulled=0 filtered=2
Oct 16 19:22:10 node-1 nomad[14315]:     2020-10-16T19:22:10.174+0200 [DEBUG] client: allocation updates: added=0 removed=0 updated=0 ignored=2
Oct 16 19:22:10 node-1 nomad[14315]:     2020-10-16T19:22:10.175+0200 [DEBUG] client: allocation updates applied: added=0 removed=0 updated=0 ignored=2 errors=0
Oct 16 19:22:10 node-1 nomad[14315]:     2020-10-16T19:22:10.825+0200 [DEBUG] client.driver_mgr.docker: docker pull succeeded: driver=docker image_ref=nginx:latest
Oct 16 19:22:10 node-1 nomad[14315]:     2020-10-16T19:22:10.833+0200 [DEBUG] client.driver_mgr.docker: image reference count incremented: driver=docker image_name=nginx:latest image_id=sha256:2ba257cc6e29816671ea9a07da67367eeea8a6769aa2cfd30392304f1d4da095 references=1
Oct 16 19:22:10 node-1 nomad[14315]:     2020-10-16T19:22:10.834+0200 [DEBUG] client.driver_mgr.docker: configured resources: driver=docker task_name=nginx memory=67108864 memory_reservation=0 cpu_shares=128 cpu_quota=0 cpu_period=0
Oct 16 19:22:10 node-1 nomad[14315]:     2020-10-16T19:22:10.834+0200 [DEBUG] client.driver_mgr.docker: binding directories: driver=docker task_name=nginx binds="[]string{"/var/nomad/node/alloc/35d49b36-c6b5-d624-2bfd-bf7d47ac32ab/alloc:/alloc", "/var/nomad/node/alloc/35d
Oct 16 19:22:10 node-1 nomad[14315]:     2020-10-16T19:22:10.834+0200 [DEBUG] client.driver_mgr.docker: networking mode not specified; using default: driver=docker task_name=nginx
Oct 16 19:22:10 node-1 nomad[14315]:     2020-10-16T19:22:10.834+0200 [DEBUG] client.driver_mgr.docker: allocated static port: driver=docker task_name=nginx ip=192.168.178.201 port=1234
Oct 16 19:22:10 node-1 nomad[14315]:     2020-10-16T19:22:10.834+0200 [DEBUG] client.driver_mgr.docker: exposed port: driver=docker task_name=nginx port=1234
Oct 16 19:22:10 node-1 nomad[14315]:     2020-10-16T19:22:10.834+0200 [DEBUG] client.driver_mgr.docker: applied labels on the container: driver=docker task_name=nginx labels=map[com.hashicorp.nomad.alloc_id:35d49b36-c6b5-d624-2bfd-bf7d47ac32ab]
Oct 16 19:22:10 node-1 nomad[14315]:     2020-10-16T19:22:10.835+0200 [DEBUG] client.driver_mgr.docker: setting container name: driver=docker task_name=nginx container_name=nginx-35d49b36-c6b5-d624-2bfd-bf7d47ac32ab
Oct 16 19:22:11 node-1 nomad[14315]:     2020-10-16T19:22:11.471+0200 [DEBUG] client: updated allocations: index=420182 total=2 pulled=0 filtered=2
Oct 16 19:22:11 node-1 nomad[14315]:     2020-10-16T19:22:11.472+0200 [DEBUG] client: allocation updates: added=0 removed=0 updated=0 ignored=2
Oct 16 19:22:11 node-1 nomad[14315]:     2020-10-16T19:22:11.473+0200 [DEBUG] client: allocation updates applied: added=0 removed=0 updated=0 ignored=2 errors=0
Oct 16 19:22:15 node-1 nomad[14315]:     2020-10-16T19:22:15.648+0200 [DEBUG] http: request complete: method=GET path=/v1/agent/health?type=client duration=1.65338ms
Oct 16 19:22:18 node-1 nomad[14315]:     2020-10-16T19:22:18.525+0200 [INFO]  client.driver_mgr.docker: created container: driver=docker container_id=4aae3ad142672fbc1ab1e968ae7c0b19c558255bf72a98eba4262911ddcc2b49
Oct 16 19:22:21 node-1 nomad[14315]:     2020-10-16T19:22:21.693+0200 [INFO]  client.driver_mgr.docker: started container: driver=docker container_id=4aae3ad142672fbc1ab1e968ae7c0b19c558255bf72a98eba4262911ddcc2b49
Oct 16 19:22:21 node-1 nomad[14315]:     2020-10-16T19:22:21.694+0200 [DEBUG] client.driver_mgr.docker.docker_logger: starting plugin: driver=docker path=/usr/bin/nomad args=[/usr/bin/nomad, docker_logger]
Oct 16 19:22:21 node-1 nomad[14315]:     2020-10-16T19:22:21.709+0200 [DEBUG] client.driver_mgr.docker.docker_logger: plugin started: driver=docker path=/usr/bin/nomad pid=15154
Oct 16 19:22:21 node-1 nomad[14315]:     2020-10-16T19:22:21.709+0200 [DEBUG] client.driver_mgr.docker.docker_logger: waiting for RPC address: driver=docker path=/usr/bin/nomad
Oct 16 19:22:21 node-1 nomad[14315]:     2020-10-16T19:22:21.751+0200 [DEBUG] client.driver_mgr.docker.docker_logger.nomad: plugin address: driver=docker @module=docker_logger address=/tmp/plugin577312502 network=unix timestamp=2020-10-16T19:22:21.750+0200
Oct 16 19:22:21 node-1 nomad[14315]:     2020-10-16T19:22:21.751+0200 [DEBUG] client.driver_mgr.docker.docker_logger: using plugin: driver=docker version=2
Oct 16 19:22:21 node-1 nomad[14315]:     2020-10-16T19:22:21.760+0200 [DEBUG] client.driver_mgr.docker.docker_logger.nomad: using client connection initialized from environment: driver=docker @module=docker_logger timestamp=2020-10-16T19:22:21.758+0200
Oct 16 19:22:22 node-1 nomad[14315]:     2020-10-16T19:22:22.023+0200 [DEBUG] client: updated allocations: index=420184 total=2 pulled=0 filtered=2
Oct 16 19:22:22 node-1 nomad[14315]:     2020-10-16T19:22:22.023+0200 [DEBUG] client: allocation updates: added=0 removed=0 updated=0 ignored=2
Oct 16 19:22:22 node-1 nomad[14315]:     2020-10-16T19:22:22.024+0200 [DEBUG] client: allocation updates applied: added=0 removed=0 updated=0 ignored=2 errors=0
Oct 16 19:22:22 node-1 nomad[14315]:     2020-10-16T19:22:22.099+0200 [DEBUG] consul.sync: sync complete: registered_services=1 deregistered_services=0 registered_checks=1 deregistered_checks=0
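
The "allocated static port" line above shows the Docker driver picking ip=192.168.178.201 even though the port names the host network. As far as I can tell from the docs, host_network on a port is documented for the group-level network block, not for the network block inside a task's resources, so the experiment above may simply not be exercising it. A minimal sketch of the same nginx test with the port moved to the group level, assuming Nomad 0.12 or later; the job name, group name, datacenter and the `to = 80` mapping are my assumptions, not part of the original test:

```hcl
# Sketch only: job/group names, datacenter and "to" are assumptions.
job "nginx-example" {
  datacenters = ["dc1"] # assumption: adjust to your datacenter name

  group "web" {
    network {
      port "http" {
        static       = 1234
        to           = 80         # assumption: nginx listens on 80 inside the container
        host_network = "zerozero" # must match a host_network name in the client config
      }
    }

    task "nginx" {
      driver = "docker"

      config {
        image = "nginx:latest"
        ports = ["http"] # publish the group-level port for this container
      }

      resources {
        cpu    = 128
        memory = 64
      }
    }
  }
}
```

Which address the port ends up on still depends on which of the node's addresses fall inside the host network's cidr. A cidr of 0.0.0.0/0 matches every address on the node, so it selects among them and does not act as a wildcard bind.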

After some more research it seems that the issue is described at https://github.com/hashicorp/nomad/issues/3675

mad[21769]:     2020-10-16T20:27:33.812+0200 [DEBUG] client.fingerprint_mgr: fingerprinting periodically: fingerprinter=consul period=15s
Oct 16 20:27:40 node-1 nomad[21769]:     2020-10-16T20:27:33.814+0200 [DEBUG] client.fingerprint_mgr.cpu: detected cpu frequency: MHz=1200
Oct 16 20:27:40 node-1 nomad[21769]:     2020-10-16T20:27:33.814+0200 [DEBUG] client.fingerprint_mgr.cpu: detected core count: cores=4
Oct 16 20:27:40 node-1 nomad[21769]:     2020-10-16T20:27:34.114+0200 [DEBUG] client.fingerprint_mgr.network: link speed detected: interface=eth0 mbits=100
Oct 16 20:27:40 node-1 nomad[21769]:     2020-10-16T20:27:34.114+0200 [DEBUG] client.fingerprint_mgr.network: detected interface IP: interface=eth0 IP=192.168.178.201
Oct 16 20:27:40 node-1 nomad[21769]:     2020-10-16T20:27:34.114+0200 [DEBUG] client.fingerprint_mgr.network: detected interface IP: interface=eth0 IP=192.168.178.222

Nomad seems to detect both addresses, but not that one of them is an alias:

eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.178.201  netmask 255.255.255.0  broadcast 192.168.178.255
        inet6 fe80::ffb8:350b:8e47:201a  prefixlen 64  scopeid 0x20<link>
        ether b8:27:eb:a9:25:bd  txqueuelen 1000  (Ethernet)
        RX packets 1220521461  bytes 139514618021 (129.9 GiB)
        RX errors 0  dropped 8  overruns 0  frame 0
        TX packets 1420222591  bytes 191440300641 (178.2 GiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

eth0:1234: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.178.222  netmask 255.255.255.0  broadcast 192.168.178.255
        ether b8:27:eb:a9:25:bd  txqueuelen 1000  (Ethernet)
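
Since the fingerprinter reports the alias address as just another IP on eth0, one thing that could be tried is a host_network whose cidr covers only the keepalived address, so that ports naming this host network can only match that one IP. A sketch under assumptions: the name "cluster" is made up, and I use the .210 cluster IP from the first post rather than the .222 test alias.

```hcl
# Sketch only: the host network name "cluster" is hypothetical.
client {
  enabled = true

  # Match exactly one address: the keepalived-managed cluster IP.
  host_network "cluster" {
    cidr = "192.168.178.210/32"
  }
}
```

A port would then reference it with host_network = "cluster". Two open points I have not verified: a node that does not currently hold the address has nothing inside this cidr, so jobs using the host network should only be placeable on the current holder, and I do not know whether the client picks up the address again after keepalived moves it without an agent restart.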