Active - Active DR solution with Nomad

steve.smith · May 10, 2024, 3:21pm

I have a 3-server Nomad cluster in Prod and the same in DR. Prod and DR are active/active, so overall it’s a 6-server cluster.
We recently had a failure of one of the 2 sites, and realized that Nomad could not reach consensus, as it needs 4 of 6 servers.
We easily worked around this, but I would rather not have a manual action.

Is there an easy way to configure Nomad to automatically fall back to reaching consensus among the remaining 3 servers when 3 of the 6 fail?

tomqwpl · May 14, 2024, 7:33am

The nature of Raft consensus is that you want an odd number of nodes in the decision-making set. The cluster continues to operate while more than half of them are available. For example, if you have an odd number and the cluster splits in two, one side always has more than half, so that side knows it can continue to make decisions and the other side knows it can’t. With an even number, that no longer holds: the cluster can split into two equal halves, neither of which has the quorum it needs to make decisions, so nothing gets done.
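To put numbers on that, here is a minimal sketch of the majority arithmetic. The function names are made up for illustration and are not part of Nomad; the only assumption is the usual Raft rule that quorum is a strict majority of the voting servers.

```python
def quorum(servers: int) -> int:
    """Smallest strict majority of a voting set: floor(n / 2) + 1."""
    return servers // 2 + 1


def tolerated_failures(servers: int) -> int:
    """How many servers can be lost while a majority is still reachable."""
    return servers - quorum(servers)


if __name__ == "__main__":
    for n in (3, 4, 5, 6):
        print(f"{n} servers: quorum {quorum(n)}, tolerates {tolerated_failures(n)} failure(s)")
```

For 6 servers the quorum is 4, which matches the "4 of 6" in the original question: losing a whole site of 3 leaves only 3, one short of a majority. Going from 5 to 6 servers also adds no extra failure tolerance.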

So the intention of Raft is that you typically have 3 or 5 nodes in the decision-making set, and you then inherently have active-active-active DR (or active-active-active-active-active). You therefore don’t need, and indeed aren’t intended, to have 3 nodes and then duplicate them in an active-active setup.

Raft-based systems work well in cloud environments like AWS, where it’s easy to set up three nodes in separate availability zones with low-latency links. They seem less well suited to a more traditional data centre scenario where you might have two data centres and want to configure an active-active setup. I’ve been pondering the same problem, and so far haven’t come up with a way of utilising Nomad in that kind of environment. If you split it with 2 nodes in one data centre and 1 node in the other, it’ll be fine if the right data centre goes, but not if the wrong one goes, which rather defeats the object. I’ll be interested in any suggestions or links on how to make use of Nomad or other consensus-based solutions in a traditional two-centre active-active scenario.
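The two-site problem can be checked with a small sketch that asks, for a given server layout, whether the surviving servers still form a majority after one site is lost. The site names and the `survives_site_loss` helper are hypothetical, and this models only the majority rule, not Nomad's actual behaviour.

```python
def survives_site_loss(layout: dict[str, int], lost_site: str) -> bool:
    """True if the servers outside lost_site still form a strict majority."""
    total = sum(layout.values())
    remaining = total - layout[lost_site]
    return remaining >= total // 2 + 1


# Hypothetical layouts: servers per site.
LAYOUTS = {
    "two sites, 3+3": {"dc1": 3, "dc2": 3},
    "two sites, 2+1": {"dc1": 2, "dc2": 1},
    "three sites, 1+1+1": {"az1": 1, "az2": 1, "az3": 1},
}

if __name__ == "__main__":
    for name, layout in LAYOUTS.items():
        outcome = {site: survives_site_loss(layout, site) for site in layout}
        print(name, outcome)
```

Under these assumptions, 3+3 survives the loss of neither site, 2+1 survives only the loss of the 1-node site, and only the three-site layout survives the loss of any single site.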
