I recently deployed a Nomad cluster using Ansible, and I’m facing an issue with the bootstrap_expect configuration.
Here’s the situation:
- Configuration:
  - I have three servers in my cluster, each with the Nomad agent installed.
  - I used Ansible to set up the configuration files, including client.hcl and server.hcl.
  - In server.hcl, I set bootstrap_expect = 3 so that the cluster expects three server nodes.
- Bootstrap Failure:
  - When I start the Nomad agents with bootstrap_expect = 3, the servers complain that they cannot elect a leader, and the cluster remains down.
- Workaround Attempt:
  - As a workaround, I set bootstrap_expect = 1 in server.hcl so that each server starts correctly on its own.
  - After starting the agents, I manually joined each server to the primary server using the nomad server join command.
  - Then I set bootstrap_expect back to 3 in the configuration and restarted nomad.service.
- Seeking Advice:
  - While this workaround temporarily solves the issue, it doesn’t feel like the correct approach.
  - I would like to understand why the Nomad cluster fails to elect a leader with bootstrap_expect = 3.
  - Is there a better way to handle this situation during the initial cluster setup?
I’m open to suggestions, troubleshooting steps, and any insights you may have. I’ve attached the relevant configuration files below for your reference.
- client.hcl:
data_dir = "/var/lib/nomad"
log_level = "INFO"
bind_addr = "0.0.0.0"
datacenter = "datacenter"
client {
  enabled = true
  servers = ["server1:4647", "server2:4647", "server3:4647"]
}
advertise {
  http = "client-node-ip:4646"
  rpc = "client-node-ip:4647"
  serf = "client-node-ip:4648"
}
- server.hcl:
data_dir = "/var/lib/nomad"
log_level = "INFO"
bind_addr = "0.0.0.0"
datacenter = "datacenter"
server {
  enabled = true
  bootstrap_expect = 3
}
advertise {
  http = "server-node-ip:4646"
  rpc = "server-node-ip:4647"
  serf = "server-node-ip:4648"
}
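
Note that nothing in this server.hcl tells the servers how to find each other. bootstrap_expect only sets how many servers must be known before the first leader election; it does not make the servers discover one another, which would be consistent with the election stalling until a manual nomad server join. A minimal sketch of a server block with a server_join section follows. It assumes the hostnames server1, server2 and server3 (borrowed from client.hcl) resolve from every server node and that the default serf port 4648 is in use; the retry values are illustrative, not tuned.

```hcl
server {
  enabled          = true
  bootstrap_expect = 3

  # Assumption: these hostnames resolve on every server node.
  # 4648 is Nomad's default serf (gossip) port.
  server_join {
    retry_join     = ["server1:4648", "server2:4648", "server3:4648"]
    retry_max      = 0     # 0 means keep retrying indefinitely
    retry_interval = "15s" # wait between join attempts
  }
}
```

With a block like this on all three servers, each agent keeps trying to join its peers at startup, so the election should begin once three servers are known, without the manual join step or the temporary bootstrap_expect = 1.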
- nomad.hcl:
datacenter = "datacenter"
data_dir = "/var/lib/nomad"
log_level = "INFO"
log_file = "/var/log/nomad/nomad.log"
bind_addr = "0.0.0.0"
server {
  enabled = true
}