Hello, I’m running a lab to experiment with extreme failure scenarios when operating a Nomad cluster.
I tried this situation:
(group1) 3 Nomad servers joined together (server1, server2, server3)
(group2) 3 Nomad servers joined together (server4, server5, server6)
I created some jobs on group1 and some on group2.
On an arbitrary server in one of the groups I ran, for example:
server4# nomad server join $IPSERVER1
So I effectively “merged” group1 with group2.
Having merged the two clusters this way, I now see the jobs that were defined in group1, but all the jobs from group2 are lost.
Can someone explain, technically, what happened here that caused the group2 jobs to be lost?
Group1 and group2 should be equivalent, so I could potentially lose the group1 jobs as well, but that does not happen in my tests.
I can’t understand the logic under the hood.
Hi, can you show the configuration of the agents in your cluster?
Yes, all servers/agents are shown correctly as members and peers of the cluster, and the cluster elected a new leader correctly.
(Before the join, each group had its own leader, obviously.)
OK, but can you share the configuration files and logs? It’s hard to say anything without them.
First of all, please note that this is not a supported scenario.
Each Nomad cluster uses the Raft Consensus Protocol to store its state. In your case, both group1 and group2 clusters have their distinct data.
Ref: Consensus Protocol | Nomad | HashiCorp Developer
It is not safe to merge two Nomad server clusters, as you won’t be able to predict the state the resulting cluster would end up with. Which of the two clusters’ data wins is decided purely by the Raft protocol.
I would recommend reading up on the Raft protocol to see why merging the clusters isn’t safe.
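As a rough illustration, here is a toy Python model of two Raft rules that are relevant here: a server only votes for a candidate whose log is at least as up to date as its own, and a follower drops entries that conflict with the leader’s log. This is a simplified sketch, not Nomad’s actual code. The `Entry` class, the function names, and the job names are all made up for the example, and it shows only one plausible way the group2 jobs could disappear.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    term: int      # election term in which the entry was written
    command: str   # state change, e.g. a job registration (hypothetical)


def last_position(log):
    """Return (last term, last index); an empty log is (0, 0)."""
    return (log[-1].term, len(log)) if log else (0, 0)


def is_up_to_date(candidate_log, voter_log):
    """Raft voting rule: compare the last term first, then the log length."""
    return last_position(candidate_log) >= last_position(voter_log)


def replicate(leader_log, follower_log):
    """Keep the follower's prefix that matches the leader, drop the rest,
    then append the leader's remaining entries.
    Simplification: real Raft compares only index and term, not the command."""
    match = 0
    for mine, theirs in zip(follower_log, leader_log):
        if mine != theirs:
            break
        match += 1
    return follower_log[:match] + leader_log[match:]


# Two histories written independently by two separate clusters (made-up data).
group1 = [Entry(1, "register job-a"), Entry(2, "register job-b"),
          Entry(2, "register job-c")]
group2 = [Entry(1, "register job-x"), Entry(1, "register job-y")]

print(is_up_to_date(group1, group2))  # a group1 server can get group2's votes
print(is_up_to_date(group2, group1))  # a group2 server cannot get group1's votes

merged = replicate(group1, group2)    # a group2 server following a group1 leader
print([e.command for e in merged])
```

In this toy run the group2 server ends up with exactly the group1 log, so job-x and job-y are gone. With different terms or log lengths the result could go the other way, which is why the outcome is not something to rely on.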
I hope this helps.
OK, thanks @Ranjandas. So, to increase the number of servers in a Nomad server cluster, I should only join new servers that are not already part of another cluster.
Is that statement correct?
Yes, you are right. But make sure the number of server agents is at most 5.
The recommended configuration is to either run 3 or 5 Nomad servers per region. This maximizes availability without greatly sacrificing performance.
Ref: Consensus Protocol | Nomad | HashiCorp Developer
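The reasoning behind 3 or 5 servers is plain majority arithmetic, which a small Python sketch can show. The function names are made up for illustration and are not part of any Nomad API.

```python
def quorum_size(servers: int) -> int:
    """Majority of servers needed to commit a Raft log entry."""
    return servers // 2 + 1


def failure_tolerance(servers: int) -> int:
    """How many servers can fail while a quorum is still reachable."""
    return servers - quorum_size(servers)


for n in range(1, 8):
    print(f"{n} servers: quorum={quorum_size(n)}, "
          f"tolerates {failure_tolerance(n)} failure(s)")
```

An even server count adds no failure tolerance over the odd count below it: 4 servers tolerate 1 failure just like 3, and 6 servers tolerate 2 failures just like 5, while every extra server adds replication work for each write.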