I am deploying a cluster with Consul and Nomad on edge devices that have dynamic IP addresses. I don’t have the option to set up DNS, so I’m using IPs. I see that both Consul and Nomad handle IP address changes very poorly. Even if I do ‘consul join’ after the address change, it doesn’t help because the cluster remembers the old addresses and doesn’t accept connections from new addresses under the same node name.
Do I understand correctly that the only option for me is to use DNS (will that work for sure?) or make the IP addresses static?
What about the autojoin functionality of Nomad? It uses Consul service discovery to form the cluster, and it does so based on IP addresses. Should I give up autojoin and switch to joining by DNS?
I tried to use DNS and faced the same issues. It looks like DNS is only used for the initial setup, and after that both Consul and Nomad talk to each other via IP addresses that are saved somewhere in state files. Am I right? Is there no way to handle IP address changes?
Consul can handle an agent’s IP address change, but the change has to be done correctly. Consul stores the peer set, which consists of the agents’ node IDs and their IP addresses, in its raft db. When a Consul agent starts, it reads this member configuration from the raft db. You can check it with the following command:
consul info | grep latest_conf
Note that if an agent’s IP address changes and no longer matches the IP recorded in the raft db, the agent won’t be able to join the cluster successfully.
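As a rough illustration of the check described above, the sketch below tests whether a given IP address appears in the raft configuration text. The helper name ip_in_raft_config and the address 10.0.0.5 are hypothetical, and the sketch assumes that peers are listed in "ip:port" form.

```shell
#!/bin/sh
# Sketch: check whether an IP address appears in raft configuration text.
# Assumption (not from the thread): peers are listed in "ip:port" form.

ip_in_raft_config() {
    # $1: IP address to look for; the configuration text is read from stdin.
    # Escape the dots so grep treats them as literal characters.
    escaped_ip=$(printf '%s' "$1" | sed 's/\./\\./g')
    # Require a non-address character (or line start) before the IP and a
    # ":port" after it, so 10.0.0.5 does not match 10.0.0.50 or 110.0.0.5.
    grep -Eq "(^|[^0-9.])${escaped_ip}:[0-9]+"
}

# Usage (10.0.0.5 is a hypothetical address):
#   consul info | grep latest_conf | ip_in_raft_config 10.0.0.5 \
#       && echo "address is in the peer set" \
#       || echo "address is NOT in the peer set"
```

Reading the text from stdin keeps the helper independent of how the configuration line was obtained.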
To change an agent’s IP address, first remove the agent from the raft peer set gracefully with the consul leave command, so that the other nodes see it with a “left” status. Do not abruptly stop or kill the agent process, because that can leave the agent in a “failed” status instead. After the graceful exit, change the agent’s IP address, then rejoin the cluster with the consul join command.
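The sequence above could be sketched as the following shell helpers. This is only a minimal sketch under assumptions that are not from the thread: the agent is assumed to run as a systemd unit named "consul", the function names are hypothetical, and 10.0.0.10 stands in for the address of any member that is still in the cluster. How the IP address itself gets changed depends on the environment, so that step is left as a placeholder.

```shell
#!/bin/sh
# Sketch of the leave -> change IP -> rejoin sequence.
# Assumptions (not from the thread): the agent runs as a systemd unit named
# "consul"; the peer address passed to rejoin is a hypothetical value.

graceful_leave() {
    # Leave the cluster gracefully so peers record this node as "left"
    # rather than "failed". The agent shuts down after a successful leave.
    consul leave
}

rejoin() {
    # $1: address of a member that is still part of the cluster.
    peer_addr="$1"
    # Start the agent again. It is assumed that its bind/advertise
    # configuration picks up the new IP address.
    systemctl start consul || return 1
    # Ask the local agent to join the cluster through the given member.
    # In practice you may need to wait until the agent is ready first.
    consul join "$peer_addr"
}

# Usage (run on the node whose address is about to change):
#   graceful_leave
#   ... change the node's IP address here ...
#   rejoin 10.0.0.10
```

Splitting the procedure into two functions keeps the environment-specific address change outside the script.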
Thank you for the detailed info!
And if the address of one of the nodes has already changed, it is impossible to perform the leave operation, am I right?
If you change the IP address of one of the nodes, the rest of the Consul cluster sees the node as unreachable and marks it as “failed”, because the new IP address isn’t in the peer set stored in the raft db. The cluster therefore loses one of its voters, which brings it one failure closer to losing quorum.
In this scenario, the node with the updated IP address can’t perform consul leave, because it’s no longer considered part of the cluster.
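To make the effect of a lost voter concrete, here is a small arithmetic sketch of raft majorities. It relies on the standard raft rule that a majority of voters must be reachable; the function names are hypothetical.

```shell
#!/bin/sh
# Raft majority arithmetic for a cluster of N voting servers.

quorum() {
    # Majority of $1 voters: floor(N / 2) + 1.
    echo $(( $1 / 2 + 1 ))
}

failure_tolerance() {
    # Number of voters that can be lost while a majority stays reachable.
    echo $(( $1 - ($1 / 2 + 1) ))
}

# Usage:
#   quorum 3             # prints 2
#   failure_tolerance 3  # prints 1
#   failure_tolerance 5  # prints 2
```

With three servers, for example, one node marked as “failed” after an IP change uses up the whole failure budget, so losing one more server would cost the cluster its quorum.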