How to configure bind_addr and advertise addresses

Hello,

I’m trying to set up Nomad on a server with a private network and a public IP. The private network IP is 10.2.0.2.

I can’t figure out how to set the bind_addr parameter.

This is the configuration:

advertise {
  http = "<private_ip>"
}

server {
  enabled = true
  bootstrap_expect = 1
}

And this is the output:

Nomad agent started! Log data will stream in below:
2020-10-28T16:28:03.624+0100 [WARN]  agent.plugin_loader: skipping external plugins since plugin_dir doesn't exist: plugin_dir=/opt/nomad/data/plugins
2020-10-28T16:28:03.629+0100 [INFO]  agent: detected plugin: name=exec type=driver plugin_version=0.1.0
2020-10-28T16:28:03.629+0100 [INFO]  agent: detected plugin: name=qemu type=driver plugin_version=0.1.0
2020-10-28T16:28:03.629+0100 [INFO]  agent: detected plugin: name=java type=driver plugin_version=0.1.0
2020-10-28T16:28:03.629+0100 [INFO]  agent: detected plugin: name=docker type=driver plugin_version=0.1.0
2020-10-28T16:28:03.629+0100 [INFO]  agent: detected plugin: name=raw_exec type=driver plugin_version=0.1.0
2020-10-28T16:28:03.629+0100 [INFO]  agent: detected plugin: name=nvidia-gpu type=device plugin_version=0.1.0
2020-10-28T16:28:03.641+0100 [INFO]  nomad.raft: initial configuration: index=1 servers="[{Suffrage:Voter ID:<private_ip>:4647 Address:<private_ip>:4647}]"
2020-10-28T16:28:03.641+0100 [INFO]  nomad.raft: entering follower state: follower="Node at 10.2.0.2:4647 [Follower]" leader=
2020-10-28T16:28:03.642+0100 [INFO]  nomad: serf: EventMemberJoin: ubuntu-8gb-hel1-1.global 10.2.0.2
2020-10-28T16:28:03.642+0100 [INFO]  nomad: starting scheduling worker(s): num_workers=4 schedulers=[service, batch, system, _core]
2020-10-28T16:28:03.642+0100 [WARN]  nomad: serf: Failed to re-join any previously known node
2020-10-28T16:28:03.643+0100 [INFO]  client: using state directory: state_dir=/opt/nomad/data/client
2020-10-28T16:28:03.643+0100 [INFO]  nomad: adding server: server="ubuntu-8gb-hel1-1.global (Addr: 10.2.0.2:4647) (DC: dc1)"
2020-10-28T16:28:03.643+0100 [INFO]  client: using alloc directory: alloc_dir=/opt/nomad/data/alloc
2020-10-28T16:28:03.652+0100 [WARN]  client.fingerprint_mgr: failed to detect bridge kernel module, bridge network mode disabled: error="could not detect kernel module bridge, could not detect kernel module bridge"
2020-10-28T16:28:03.653+0100 [INFO]  client.fingerprint_mgr.cgroup: cgroups are available
2020-10-28T16:28:03.659+0100 [WARN]  client.fingerprint_mgr.network: unable to parse speed: path=/usr/sbin/ethtool device=eth0
2020-10-28T16:28:03.660+0100 [WARN]  client.fingerprint_mgr.network: unable to parse speed: path=/usr/sbin/ethtool device=lo
2020-10-28T16:28:03.664+0100 [WARN]  client.fingerprint_mgr.network: unable to parse speed: path=/usr/sbin/ethtool device=eth0
2020-10-28T16:28:03.669+0100 [WARN]  client.fingerprint_mgr.network: unable to parse speed: path=/usr/sbin/ethtool device=enp7s0
2020-10-28T16:28:03.676+0100 [INFO]  client.plugin: starting plugin manager: plugin-type=csi
2020-10-28T16:28:03.676+0100 [INFO]  client.plugin: starting plugin manager: plugin-type=driver
2020-10-28T16:28:03.677+0100 [INFO]  client.plugin: starting plugin manager: plugin-type=device
2020-10-28T16:28:03.678+0100 [INFO]  client: started client: node_id=cc49d8f2-ea25-33a5-34fc-c5b307254c79
2020-10-28T16:28:05.534+0100 [WARN]  nomad.raft: heartbeat timeout reached, starting election: last-leader=
2020-10-28T16:28:05.534+0100 [INFO]  nomad.raft: entering candidate state: node="Node at 10.2.0.2:4647 [Candidate]" term=594
2020-10-28T16:28:05.536+0100 [ERROR] nomad.raft: failed to make requestVote RPC: target="{Voter <private_ip>:4647 <private_ip>:4647}" error="dial tcp 135.181.86.239:4647: connect: connection refused"
2020-10-28T16:28:06.791+0100 [WARN]  nomad.raft: Election timeout reached, restarting election
2020-10-28T16:28:06.791+0100 [INFO]  nomad.raft: entering candidate state: node="Node at 10.2.0.2:4647 [Candidate]" term=595
2020-10-28T16:28:06.793+0100 [ERROR] nomad.raft: failed to make requestVote RPC: target="{Voter <private_ip>:4647 <private_ip>:4647}" error="dial tcp 135.181.86.239:4647: connect: connection refused"
2020-10-28T16:28:08.540+0100 [WARN]  nomad.raft: Election timeout reached, restarting election
2020-10-28T16:28:08.541+0100 [INFO]  nomad.raft: entering candidate state: node="Node at 10.2.0.2:4647 [Candidate]" term=596
2020-10-28T16:28:08.543+0100 [ERROR] nomad.raft: failed to make requestVote RPC: target="{Voter <private_ip>:4647 <private_ip>:4647}" error="dial tcp 135.181.86.239:4647: connect: connection refused"

What am I doing wrong? How is this supposed to work?

Thanks for your help!

Hi @AdrianRibao

The bind_addr configuration parameter can be set within your config file using the following snippet:

bind_addr = "10.2.0.2"

If you also want the HTTP advertise address to be that same private IP, and the IP is fixed, set the advertise parameter to the IP directly. In that case your config might look something like:

advertise {
  http = "10.2.0.2"
}

server {
  enabled          = true
  bootstrap_expect = 1
}
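
For reference, here is a minimal sketch that combines the two settings above in a single file. It assumes the private IP 10.2.0.2 from this thread and also pins the RPC and Serf advertise addresses, which was not part of the original suggestion. With bind_addr set to a specific IP, the advertise addresses normally follow it, so the advertise block may be optional here.

```hcl
# Sketch only: 10.2.0.2 is the private IP mentioned in this thread.
bind_addr = "10.2.0.2"

# Addresses that other agents are told to use when contacting this one.
advertise {
  http = "10.2.0.2" # HTTP API, default port 4646
  rpc  = "10.2.0.2" # server RPC, default port 4647
  serf = "10.2.0.2" # gossip, default port 4648
}

server {
  enabled          = true
  bootstrap_expect = 1
}
```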

If you do not know what the private IP will be when the server launches with a baked configuration file, you can use the go-sockaddr/template format instead, which would give a config file looking something like:

advertise {
  http = "{{ GetPrivateIP }}"
}

server {
  enabled          = true
  bootstrap_expect = 1
}
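
If the host has several private addresses, a template can also select the address of a named interface. The sketch below assumes the private network is attached to enp7s0, one of the devices that appears in the fingerprinting logs earlier in the thread; that mapping is a guess, so substitute the interface that actually carries 10.2.0.2.

```hcl
# Assumption: enp7s0 is the interface on the private network.
# GetInterfaceIP is a go-sockaddr template function that returns
# the IP address of the named interface.
bind_addr = "{{ GetInterfaceIP \"enp7s0\" }}"

advertise {
  http = "{{ GetInterfaceIP \"enp7s0\" }}"
}

server {
  enabled          = true
  bootstrap_expect = 1
}
```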

I hope this helps.

Thanks,
jrasell and the Nomad team.

Thanks @jrasell! I did that and it is still not working. See the logs:

==> Starting Nomad agent...
==> Nomad agent configuration:
       Advertise Addrs: HTTP: 10.2.0.2:4646; RPC: 10.2.0.2:4647; Serf: 10.2.0.2:4648
            Bind Addrs: HTTP: 10.2.0.2:4646; RPC: 10.2.0.2:4647; Serf: 10.2.0.2:4648
                Client: false
             Log Level: INFO
                Region: global (DC: dc1)
                Server: true
               Version: 0.12.7
==> Nomad agent started! Log data will stream in below:
    2020-10-29T09:56:45.896+0100 [WARN]  agent.plugin_loader: skipping external plugins since plugin_dir doesn't exist: plugin_dir=/opt/nomad/data/plugins
    2020-10-29T09:56:45.900+0100 [INFO]  agent: detected plugin: name=raw_exec type=driver plugin_version=0.1.0
    2020-10-29T09:56:45.900+0100 [INFO]  agent: detected plugin: name=exec type=driver plugin_version=0.1.0
    2020-10-29T09:56:45.900+0100 [INFO]  agent: detected plugin: name=qemu type=driver plugin_version=0.1.0
    2020-10-29T09:56:45.900+0100 [INFO]  agent: detected plugin: name=java type=driver plugin_version=0.1.0
    2020-10-29T09:56:45.900+0100 [INFO]  agent: detected plugin: name=docker type=driver plugin_version=0.1.0
    2020-10-29T09:56:45.900+0100 [INFO]  agent: detected plugin: name=nvidia-gpu type=device plugin_version=0.1.0
    2020-10-29T09:56:45.911+0100 [INFO]  nomad.raft: initial configuration: index=1 servers="[{Suffrage:Voter ID:<public_ip>:4647 Address:<public_ip>:4647}]"
    2020-10-29T09:56:45.911+0100 [INFO]  nomad.raft: entering follower state: follower="Node at 10.2.0.2:4647 [Follower]" leader=
    2020-10-29T09:56:45.911+0100 [INFO]  nomad: serf: EventMemberJoin: ubuntu-8gb-hel1-1.global 10.2.0.2
    2020-10-29T09:56:45.912+0100 [INFO]  nomad: starting scheduling worker(s): num_workers=4 schedulers=[service, batch, system, _core]
    2020-10-29T09:56:45.912+0100 [WARN]  nomad: serf: Failed to re-join any previously known node
    2020-10-29T09:56:45.912+0100 [INFO]  nomad: adding server: server="ubuntu-8gb-hel1-1.global (Addr: 10.2.0.2:4647) (DC: dc1)"
    2020-10-29T09:56:47.139+0100 [WARN]  nomad.raft: heartbeat timeout reached, starting election: last-leader=
    2020-10-29T09:56:47.139+0100 [INFO]  nomad.raft: entering candidate state: node="Node at 10.2.0.2:4647 [Candidate]" term=42521
    2020-10-29T09:56:47.141+0100 [ERROR] nomad.raft: failed to make requestVote RPC: target="{Voter <public_ip>:4647 <public_ip>:4647}" error="dial tcp <public_ip>:4647: connect: connection refused"
    2020-10-29T09:56:48.825+0100 [WARN]  nomad.raft: Election timeout reached, restarting election
    2020-10-29T09:56:48.826+0100 [INFO]  nomad.raft: entering candidate state: node="Node at 10.2.0.2:4647 [Candidate]" term=42522
    2020-10-29T09:56:48.828+0100 [ERROR] nomad.raft: failed to make requestVote RPC: target="{Voter <public_ip>:4647 <public_ip>:4647}" error="dial tcp <public_ip>:4647: connect: connection refused"
    2020-10-29T09:56:50.349+0100 [WARN]  nomad.raft: Election timeout reached, restarting election
    2020-10-29T09:56:50.349+0100 [INFO]  nomad.raft: entering candidate state: node="Node at 10.2.0.2:4647 [Candidate]" term=42523
    2020-10-29T09:56:50.351+0100 [ERROR] nomad.raft: failed to make requestVote RPC: target="{Voter <public_ip>:4647 <public_ip>:4647}" error="dial tcp <public_ip>:4647: connect: connection refused"

Why is it still using the public_ip, as shown in the logs?

[ERROR] nomad.raft: failed to make requestVote RPC: target="{Voter <public_ip>:4647 <public_ip>:4647}" error="dial tcp <public_ip>:4647: connect: connection refused"

This is my config:

data_dir = "/opt/nomad/data"
bind_addr = "10.2.0.2"

server {
  enabled = true
  bootstrap_expect = 1
}

client {
  enabled = true
  servers = ["10.2.0.2:4646"]
}

Thanks!

Hi @AdrianRibao,

Could you share how you are starting the Nomad agent? I ran your config locally on a Linux Vagrant machine (changing the private IP) and it worked without problems.

One thing that stands out is the log line "Failed to re-join any previously known node". Do you have previous state in your data directory? If so, it would be worth clearing it out or using a new data directory for testing.
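
As a sketch of the "new data directory" option, the config from the previous post could point data_dir at an empty location. The path /opt/nomad/data-test is hypothetical. The likely reason this helps is that the Raft peer list is persisted in the data directory, and the logs above show it still holding a server entry at the public IP from an earlier run.

```hcl
# Hypothetical fresh path; any empty directory the agent can write to works.
data_dir  = "/opt/nomad/data-test"
bind_addr = "10.2.0.2"

server {
  enabled          = true
  bootstrap_expect = 1
}
```

Starting from an empty directory discards all previous server and client state, so this is only appropriate for testing or for a node whose state can be thrown away.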

Thanks,
jrasell and the Nomad team.

@jrasell Awesome! I deleted /opt/nomad and now it works as expected!

Since you asked, I started it with the default systemd command:

/usr/bin/nomad agent -config /etc/nomad.d

Thanks a lot!

Hi jrasell,
I am facing the same issue.
Is there a command to clear the previously known node?