No Cluster Leader - Nomad

Hello,

I have set up three server nodes and was able to form a cluster, but none of them has been elected leader. I have attached a screenshot in case anybody has faced the same problem.

Hi @nirankush.tyagi94,

Could you provide some additional information such as the configuration you are using and logs from the servers?

Thanks,
jrasell and the Nomad team

Hi jrasell,

  1. nomad.service is attached.
  2. I have attached the server.hcl file, which is the same for all three servers; only the name is changed.
  3. The journalctl log for Nomadserver - 1 is attached.

Please let me know if I missed anything.

nomad-service.txt (1.4 KB)


Hello,

Please let me know if anyone can help.

Hi @nirankush.tyagi94,

The journalctl logs indicate the process is failing to start, which could mean a configuration problem. Can you share the full logs of the process, which according to your config file should be within /etc/nomad.d/krausen.log? If they are not there, I would also suggest checking /var/log/syslog.

Thanks,
jrasell and the Nomad team

Hello,
I am attaching the logs. The logs are the same on all three servers.

I have attached the logs from the server on which I’m running nomad server members.

Hi @nirankush.tyagi94,

I’ll need the logs of the server which is failing to start. The cluster is not able to elect a leader because it’s likely not reaching the bootstrap_expect threshold, which is set to 5. If you only have 3 servers, this should be set to 3, not 5.
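
To make the arithmetic concrete, here is a minimal Python sketch. It is illustrative only: the function names are made up and are not part of Nomad. It assumes the usual Raft majority rule, quorum = floor(n/2) + 1, and that servers hold off bootstrapping until bootstrap_expect servers have joined.

```python
def quorum_size(voters: int) -> int:
    """Majority needed among `voters` Raft voters (floor(n/2) + 1)."""
    if voters < 1:
        raise ValueError("need at least one voter")
    return voters // 2 + 1


def can_bootstrap(servers_joined: int, bootstrap_expect: int) -> bool:
    """Hypothetical helper: bootstrapping only starts once the
    expected number of servers has been seen."""
    return servers_joined >= bootstrap_expect


if __name__ == "__main__":
    servers_joined = 3  # the three servers described in this thread
    for expect in (5, 3):
        print(
            f"bootstrap_expect={expect}: "
            f"can bootstrap with {servers_joined} servers = "
            f"{can_bootstrap(servers_joined, expect)}, "
            f"quorum for {expect} voters = {quorum_size(expect)}"
        )
```

With three servers and bootstrap_expect set to 5, the helper reports that bootstrapping cannot start, which matches the symptom of a cluster that has members but no leader.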

Thanks,
jrasell and the Nomad team

Hi Jrasell,

I tried changing the bootstrap_expect threshold to 3, but it gives the same error.

Attaching all logs from server-1 (krausen.log file, latest) and the service running status for your reference.

Please check.

Thanks
Nirankush
krausen.txt (2.2 MB)
krausen-1698841764257166642.txt (2.8 MB)
krausen-1699017553343635619.txt (4.2 MB)
krausen-1699103953712998382.txt (2.4 MB)

Hey jrasell,

Any luck? Could you please check?

I had the same issue. After changing bootstrap_expect to 3, everything works fine.

Hi @nirankush.tyagi94,

Your logs contain errors as shown below, which indicate the servers are struggling to connect with each other. Connectivity issues would stop a quorum being reached and a leader elected. If possible, I would also suggest deleting the data directory on each server and starting fresh, to ensure stale data is not a factor.

2023-11-14T13:21:14.067Z [ERROR] nomad.raft: failed to make requestVote RPC: target="{Voter 051fb966-baf3-b3f3-a761-f7197df468d4 10.231.230.190:4647}" error="dial tcp 10.231.230.190:4647: connect: no route to host" term=838
2023-11-14T13:21:14.067Z [ERROR] nomad.raft: failed to make requestVote RPC: target="{Voter 43e670fa-c81e-caa3-633a-e6ee83c177c3 10.231.230.213:4647}" error="dial tcp 10.231.230.213:4647: connect: no route to host" term=838

Thanks,
jrasell and the Nomad team

Thanks jrasell for your help. I found the issue: RPC port 4647 was not open on the three servers.
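
For anyone hitting the same "no route to host" errors, a quick way to confirm whether the ports are reachable between servers is a small TCP check like the sketch below. It is only a rough aid, not an official tool: the helper names are made up, the port numbers are Nomad's defaults (4646 HTTP, 4647 RPC, 4648 Serf) and may differ if your configuration overrides them, and the peer addresses are the ones from the log lines earlier in the thread.

```python
import socket

# Nomad default ports; change these if your agent config overrides them.
# Note: Serf also uses UDP on 4648, which this TCP-only check does not cover.
NOMAD_PORTS = {"http": 4646, "rpc": 4647, "serf": 4648}


def tcp_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and "no route to host".
        return False


def check_peers(peers, ports=NOMAD_PORTS, timeout: float = 2.0):
    """Hypothetical helper: map (host, port name) to reachability."""
    return {
        (host, name): tcp_port_open(host, port, timeout)
        for host in peers
        for name, port in ports.items()
    }


if __name__ == "__main__":
    # Peer addresses taken from the requestVote errors in the logs.
    peers = ["10.231.230.190", "10.231.230.213"]
    for (host, name), ok in check_peers(peers).items():
        print(f"{host} {name}: {'open' if ok else 'unreachable'}")
```

Run it from each server in turn; a port reported as unreachable usually points at a firewall or security group rule rather than at Nomad itself.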

Now I’m trying to add a client node to this cluster, but I am getting this error in journalctl -u nomad.

See the top line about Consul not being able to query datacenters, and start from there.

Hi @nirankush.tyagi94,

Could you share the client configuration you are using? If you are using Consul service discovery for the clients to discover the servers, then the error as pointed out would be causing this.

I think it is also useful to point out this set of tutorials, which might help guide you in setting up a Nomad cluster: Cluster Setup | Nomad | HashiCorp Developer

Thanks,
jrasell and the Nomad team

Thanks jrasell, I was able to add the client node to the cluster.

Thanks, it worked for me.

Hi jrasell,

I was able to set up a cluster with two server nodes and one client node.

But when I try to plan or deploy a job, it doesn’t get deployed to the cluster. I get a warning (attached), and the job remains in its desired state and does not get placed.

Thanks
Nirankush

Can anyone help me with this?

@nirankush.tyagi94, what driver are you using for your job?

The error you are getting is because your client node doesn’t have the driver enabled.

Can you share the output of nomad node status -short <your-client-node-id>?

Thanks for your response. I’m not able to see any drivers, but I am able to run containers locally on this node. Any suggestions on how I should install the driver?

ID = 0f747bd0-2444-fd49-5f27-756a1497f9f3
Name = nomad_client_1
Node Pool = default
Class =
DC = dc1
Drain = false
Eligibility = eligible
Status = ready
CSI Controllers =
CSI Drivers =
Host Volumes =
Host Networks =
CSI Volumes =
Drivers =
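
The empty Drivers line in the output above suggests the client has not fingerprinted any usable task driver, which would be consistent with the placement warning. As a rough aid for checking this across several nodes, here is a small Python sketch that parses output of this "Key = Value" shape. The function names are made up, and it assumes multiple drivers appear as a comma-separated list on the Drivers line.

```python
def parse_node_status(text: str) -> dict:
    """Parse 'Key = Value' lines from node status output into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # skip lines that contain no '='
            fields[key.strip()] = value.strip()
    return fields


def detected_drivers(text: str) -> list:
    """Hypothetical helper: return driver names from the Drivers line.

    Assumes a comma-separated list; an empty line yields an empty list.
    """
    raw = parse_node_status(text).get("Drivers", "")
    return [name.strip() for name in raw.split(",") if name.strip()]


if __name__ == "__main__":
    # Excerpt of the output posted above.
    sample = """Name = nomad_client_1
Status = ready
Drivers =
"""
    drivers = detected_drivers(sample)
    if drivers:
        print("Drivers detected:", ", ".join(drivers))
    else:
        print("No drivers detected on this node")
```

For the output posted here this prints "No drivers detected on this node", so the next thing to look at is why the client agent is not detecting the driver your job uses.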