How can I get other host networks recognised on existing nodes?

Hi there.

I’m trying to understand why some clients in a cluster, which have host_networks, are not having their host networks recognised when I try to schedule jobs on them.

My situation

I have a small cluster of machines, which are all connected on a private network, as well as having public IP addresses.

My network stanza

Every machine has this network stanza, in /etc/nomad.d/config/networks.hcl.

client {
  host_network "hetzner-1" {
    interface = "ens10"
    cidr      = "10.0.0.0/24"
  }
}

This should be merged with the main config file at /etc/nomad.d/nomad.hcl, which looks like this:

# Full configuration options can be found at https://www.nomadproject.io/docs/configuration

data_dir  = "/opt/nomad/data"
bind_addr = "10.0.0.4"

advertise {
  http = "10.0.0.4"
  rpc  = "10.0.0.4"
  serf = "10.0.0.4"
}


client {
  enabled = true

  # https://www.nomadproject.io/docs/configuration/server_join
  server_join {
    retry_join = ["10.0.0.2"]
  }
}

telemetry {
  publish_allocation_metrics = true
  publish_node_metrics       = true
  prometheus_metrics         = true
}

log_level = "INFO"
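
For reference, the merging depends on how the agent is pointed at these files. As far as I know, -config can be passed multiple times, and when given a directory it loads the .hcl and .json files in that directory but not in subdirectories, so I’d expect the systemd unit to run something along these lines (a sketch, not my actual unit file):

nomad agent -config=/etc/nomad.d -config=/etc/nomad.d/config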

I would expect the merging to mean that the host network is picked up, but only one machine seems to have the network recognised.

I have a system job which I want to run on every node, but most of my nodes are being filtered out because they’re assumed not to have the host network defined in the network stanza above.

My job file looks like this:

job "node_exporter" {
  datacenters = ["dc1"]

  type = "system"

  group "node_exporter" {

    count = 1

    network {
      mode = "host"
      port "node_exporter" {
        static       = 9100
        host_network = "hetzner-1"
      }
    }

    task "node_exporter" {
      driver = "docker"

      config {
        image = "prom/node-exporter:v1.4.0"
        ports = ["node_exporter"]
        args = [
          "--web.listen-address", ":${NOMAD_PORT_node_exporter}",
        ]
      }

      resources {
        cpu    = 100 # 100 MHz
        memory = 64  # 64MB
      }
    }
  }
}

Because every client node has the host_network stanza, I would expect this system job to run a node_exporter on every client.

However, I get this response:

nomad plan ./nomad/02-node-exporter.hcl
Job: "node_exporter"
Task Group: "node_exporter" (1 in-place update)
  Task: "node_exporter"

Scheduler dry-run:
- WARNING: Failed to place allocations on all nodes.
  Task Group "node_exporter" (failed to place 1 allocation):
    * Class "worker": 1 nodes excluded by filter
    * Constraint "missing host network \"hetzner-1\" for port \"node_exporter\"": 3 nodes excluded by filter

Job Modify Index: 72860

Every node apart from the server node is being filtered out, because the host network is assumed to be missing.

How do I get my host network recognised, so nodes are not filtered out unnecessarily?

Brief update.

I thought maybe the issue was that once fingerprinting happens, there is no way to change a node.

That doesn’t appear to be the case.

After deleting the client data in /opt/nomad/data and rejoining, the client does seem to have joined fresh, but the host networks declared in the client stanza are still not detected on the client nodes.
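
(For the record, this is roughly what I ran on a client to do that; destructive, since it wipes the local client state that Nomad keeps under its data_dir:)

systemctl stop nomad
rm -rf /opt/nomad/data/client
systemctl start nomad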

This is frustrating, because I’m declaring the host_network in exactly the same way as on the server node, where it is working!

Yeah, I’m stumped on this.

Slight update. I think deleting the local state might be the only way to get new host networks detected after all. Is this intended behaviour?

Here’s the output from the CLI when using the handy nomad node status -verbose command as outlined in PR 11432.

I’m also using just to set environment variables like NOMAD_ADDR and NOMAD_TOKEN as a convenience.
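
(The just part is only a thin wrapper around the CLI; a minimal sketch of my justfile, with hypothetical values:)

# justfile: export the env vars every recipe needs
export NOMAD_ADDR  := "http://10.0.0.2:4646"
export NOMAD_TOKEN := "REDACTED"

# pass any arguments straight through to the nomad CLI
nomad *args:
    nomad {{args}}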

monitoring

The host network was recognised when this one was set up.

# just nomad node status -verbose -json ba1cedee  | jq  '.HostNetworks'
{
  "hetzner-1": {
    "CIDR": "10.0.0.0/24",
    "Interface": "ens10",
    "Name": "hetzner-1",
    "ReservedPorts": ""
  }
}

app2

I didn’t destroy the local state dir on this one.

# just nomad node status -verbose -json fcf9f77f  | jq  '.HostNetworks'

null

app1

I did destroy the local state dir on this one. The host network shows up after it rejoined and I deleted the previous fingerprinted record of the machine, which I assume the server had kept.

# just nomad node status -verbose -json 01d1a0d8  | jq  '.HostNetworks'
{
  "hetzner-1": {
    "CIDR": "10.0.0.0/24",
    "Interface": "ens10",
    "Name": "hetzner-1",
    "ReservedPorts": ""
  }
}

db

I didn’t clear out the state on this one either. No sign of the host network.

# just nomad node status -verbose -json a2915cd0  | jq  '.HostNetworks'

null

Based on this, I would expect the system job above, which only looks for this host network, to show both monitoring (ba1cedee) and app1 (01d1a0d8) as eligible for running, and to filter out both app2 (fcf9f77f) and db (a2915cd0).
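
(As an aside, here’s a quick loop to dump the HostNetworks for every node in one go; a sketch that assumes jq is on the path and that the node stub JSON from nomad node status -json includes an ID field:)

for id in $(nomad node status -json | jq -r '.[].ID'); do
  echo "== $id"
  nomad node status -verbose -json "$id" | jq '.HostNetworks'
done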

Hmm… that wasn’t it after all.

Planning shows the other machines being filtered out. Here’s the job file:

job "node_exporter" {
  datacenters = ["dc1"]

  type = "system"

  group "node_exporter" {

    count = 1

    network {
      mode = "host"
      port "node_exporter" {
        static       = 9100
        host_network = "hetzner-1"
      }
    }

    task "node_exporter" {
      driver = "docker"

      config {
        image = "prom/node-exporter:v1.4.0"
        // we expose the port
        ports = ["node_exporter"]
        // but pass just the port number to docker, as trying to listen on
        // the internal IP for the node_exporter port will fail
        args = [
          "--web.listen-address", ":${NOMAD_PORT_node_exporter}",
        ]
      }

      resources {
        cpu    = 100 # 100 MHz
        memory = 64  # 64MB
      }
    }
  }
}

And here’s the result:

# just nomad plan ./nomad/02-node-exporter.hcl
Job: "node_exporter"
Task Group: "node_exporter" (1 in-place update)
  Task: "node_exporter"

Scheduler dry-run:
- WARNING: Failed to place allocations on all nodes.
  Task Group "node_exporter" (failed to place 1 allocation):
    * Class "worker": 1 nodes excluded by filter
    * Constraint "missing host network \"hetzner-1\" for port \"node_exporter\"": 3 nodes excluded by filter

Job Modify Index: 72860
To submit the job with version verification run:

nomad job run -check-index 72860 ./nomad/02-node-exporter.hcl

When running the job with the check-index flag, the job will only be run if the
job modify index given matches the server-side version. If the index has
changed, another user has modified the job and the plan's results are
potentially invalid.

This makes no sense to me. A system job should run on all nodes, right? The filtered-out nodes are showing as “empty client”. What does that mean?

I see it mentioned in the docs about topology visualisation, where it seems to mean a client running no allocations, but beyond that the significance is less clear to me.

Sigh, this is really frustrating. I’ve tried spinning up a totally fresh VM with a carefully written new nomad.hcl file at /etc/nomad.hcl, and I still hit a brick wall.

For the avoidance of doubt, here is my nomad.hcl:

# Full configuration options can be found at https://www.nomadproject.io/docs/configuration

data_dir  = "/opt/nomad/data"
bind_addr = "10.0.0.6"

advertise {
  http = "10.0.0.6"
  rpc  = "10.0.0.6"
  serf = "10.0.0.6"
}


client {
  enabled = true

  # https://www.nomadproject.io/docs/configuration/server_join
  server_join {
    retry_join = ["10.0.0.2"]
  }

  host_network "hetzner-1" {
    interface = "ens10"
    cidr      = "10.0.0.0/24"
  }

    host_volume "persistent_data" {
    path = "/var/lib/nomad_persistent_data"
    read_only = false
  }


  node_class = "worker"
  meta {
    role = "worker"
  }

}

telemetry {
  publish_allocation_metrics = true
  publish_node_metrics       = true
  prometheus_metrics         = true
}

log_level = "INFO"

I’ve figured out that the client stanzas were not merging as expected.

So, I think I needed to declare all the information for the client stanza in a single file, rather than spreading it across multiple files, as the documentation seemed to suggest was possible.

The host volume is now appearing, as are the node class and meta info.

The host networks also now reliably appear when I query using the node status trick from before:

# just nomad node status -verbose -json 8654f2dc | jq '.HostNetworks'

The output looks like this:

{
  "hetzner-1": {
    "CIDR": "10.0.0.0/24",
    "Interface": "ens10",
    "Name": "hetzner-1",
    "ReservedPorts": ""
  }
}

This makes me think the network is now being recognised on the client, even if it doesn’t appear in the UI when you visit:

https://nomad.emberapp.com/ui/clients/LONG_ALPHANUMERIC_ID

Planning and running jobs with networks still a miserable failure

The nomad plan step for the job still fails to find any nodes that are eligible and have this host_network, though.

There is only one node I am trying to work with here, which has the hostname app3.greenweb.org.

Here’s my job:

job "staging_server" {
  datacenters = ["dc1"]

  type = "service"

  group "staging" {

    count = 1

    volume "persistent_data" {
      type      = "host"
      read_only = false
      source    = "persistent_data"
    }

    network {
      mode = "host"
      port "rabbit" {
        static       = 5672
        host_network = "hetzner-1"
      }
      port "redis" {
        static       = 6379
        host_network = "hetzner-1"
      }
      port "mariadb" {
        static       = 3306
        host_network = "hetzner-1"
      }

    }

    task "rabbit" {
      driver = "docker"

      config {
        image = "rabbitmq:3"
        ports = ["rabbit"]
      }

      resources {
        cpu    = 250 # 250 MHz
        memory = 256 # 256 MB
      }

      constraint {
        attribute = "${attr.unique.hostname}"
        value     = "app3.greenweb.org"
      }
    }

    task "mariadb" {
      driver = "docker"

      env {
        MARIADB_ROOT_PASSWORD = "HASSLE_TO_ADD_WITHOUT_A_TEMPLATE_FILE"
      }

      config {
        image = "mariadb:10.9"
        ports = ["mariadb"]

      }

      resources {
        cpu    = 250 # 250 MHz
        memory = 256 # 256 MB
      }

      constraint {
        attribute = "${attr.unique.hostname}"
        value     = "app3.greenweb.org"
      }

      volume_mount {
        volume      = "persistent_data"
        destination = "/var/lib/mysql"
        read_only   = false
      }

    }

    task "redis" {
      driver = "docker"

      config {
        image = "redis:6"
        ports = ["redis"]

      }

      resources {
        cpu    = 250 # 250 MHz
        memory = 256 # 256 MB
      }

      constraint {
        attribute = "${attr.unique.hostname}"
        value     = "app3.greenweb.org"
      }

    }
  }
}

What am I doing wrong?

Is there a way to debug the evaluation process, to see why nodes are being filtered out of the list of eligible nodes for a job?

Where should I be looking for a working network stanza to learn from? I’m pretty much at a wall here now.
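
(Edit from future me: I believe the scheduler’s filtering decisions can be inspected after the fact with the eval commands, something like the sketch below, though it only occurred to me later:)

nomad eval list
nomad eval status -verbose <eval_id>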

OMG.

I changed the name from hetzner-1 to hetzner in the client config and the job and it worked.

I’ve spent hours on this!

I don’t understand why this is the case - is this a bug?

I figured having a hyphen in a name would be OK if it was in a string, surely?

Adding this for future me.

In the end the problem wasn’t really Nomad.

With Hetzner cloud, when you set up a private network, you need to pay attention to the name of the interface when you declare a host network, along these lines:

client {
  host_network "hetzner" {
    interface      = "my-interface-name"
    cidr           = "10.0.0.0/24"
    reserved_ports = "22,80"
  }
}

This is because the interface name isn’t consistent across all machine types.

For the older VMs it was ens10, but the newer machine types use different network interface names:

Network                          CX, CCX*1   CPX, CAX, CCX2, CCX3
First attached network           ens10       enp7s0
Additional interfaces (second)   ens11       enp8s0
Additional interfaces (third)    ens12       enp9s0

This is documented on the corresponding page in Hetzner’s docs.
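
So the first thing worth doing on each box is checking what the private interface is actually called; a quick sketch:

# list interfaces with their addresses, then find the one on the private range
ip -brief addr show
ip -o addr show | grep '10\.0\.0\.'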

Looking back: what would have saved me lots of heartache

I’ve learned that Nomad lets you set up a host network with a named interface that doesn’t exist on the host machine, and it doesn’t give you any indication that you’re pointing at a non-existent network interface.

So this:

client {
  host_network "hetzner" {
    interface      = "ens12345"
    cidr           = "10.0.0.0/24"
    reserved_ports = "22,80"
  }
}

Will show up in the node info output when you call:

nomad node status -verbose <node_id>

And in the info you’ll see output like this:

Host Networks
Name     CIDR         Interface  ReservedPorts
hetzner  10.0.0.0/24  ens12345   22,80

Which can give the impression that everything is working, except when you try to place a job, you’ll see output like this:

Scheduler dry-run:
- WARNING: Failed to place all allocations.
  Task Group "<MY_JOB_NAME>" (failed to place 1 allocation):
    * Constraint "missing host network \"hetzner\" for port \"http\"": 1 nodes excluded by filter

The host network isn’t missing - it’s clearly there! I think it’s more the case that the IP address allocated to the node can’t be determined, and as a result the node fails the test against the constraint.
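
A pre-flight check along these lines on each host would have caught this for me straight away (a sketch, using the interface name from my client config):

# fail loudly if the interface named in host_network does not exist
ip link show ens10 >/dev/null 2>&1 || echo "WARNING: host_network interface ens10 not found" >&2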

Having some logs when a Nomad client starts up, warning that a host_network interface could not be found, would really help, because there appears to be a bunch of network output logged on startup anyway. Here’s some of the start-up logs:

Sep 15 14:23:32 app3.my-org.org nomad[80209]:     2023-09-15T14:23:32.853Z [WARN]  client.fingerprint_mgr.network: unable to parse speed: path=/usr/sbin/ethtool device=eth0
Sep 15 14:23:32 app3.my-org.org nomad[80209]:     2023-09-15T14:23:32.853Z [WARN]  client.fingerprint_mgr.network: unable to parse speed: path=/usr/sbin/ethtool device=lo
Sep 15 14:23:32 app3.my-org.org nomad[80209]:     2023-09-15T14:23:32.856Z [WARN]  client.fingerprint_mgr.network: unable to parse speed: path=/usr/sbin/ethtool device=eth0
Sep 15 14:23:32 app3.my-org.org nomad[80209]:     2023-09-15T14:23:32.861Z [WARN]  client.fingerprint_mgr.network: unable to parse speed: path=/usr/sbin/ethtool device=enp7s0
Sep 15 14:23:32 app3.my-org.org nomad[80209]:     2023-09-15T14:23:32.864Z [WARN]  client.fingerprint_mgr.network: unable to parse speed: path=/usr/sbin/ethtool device=docker0

Seeing the interface names in these logs is what led me to the fix in the end.
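
(A quick way to pull those lines back out of the journal, assuming Nomad runs as a systemd unit named nomad:)

journalctl -u nomad --since today | grep fingerprint_mgr.network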

I hope this helps someone else in future (or even future me, next time I’m troubleshooting network issues…)
