Consul Cluster Architecture over 2 AZs

Hello HashiCorp community! I’m facing a small infrastructure complexity issue regarding a Consul cluster (without using service mesh features).

I have an infrastructure of 6 servers (not necessarily related to Consul). Unfortunately, there are only two availability zones (AZs), and I would like to be able to tolerate the loss of one AZ.

From what I've read in the documentation, a single Consul server cluster with 3, 5, or 6 nodes can lose quorum when an AZ goes down, unless I use Redundancy Zones (which is currently not feasible).

I’m wondering whether a configuration using WAN federation and two clusters, one per AZ with 3 server nodes each, would be reliable. The applications scale horizontally across the two AZs. From what I understand, in this setup, if I lose one AZ, all my services in that AZ become unavailable, but the services in the remaining AZ stay available, served by their local Consul cluster.

Do you have any thoughts on this issue?

As you’ve correctly observed, there’s no way to split a consensus quorum across two AZs such that either AZ remains functional if the other fails: a quorum needs a strict majority of the servers, and only one AZ at most can hold that majority.
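
To illustrate the arithmetic, here is a minimal sketch, assuming the usual Raft majority rule (a quorum is floor(N/2) + 1 of the N voting servers). The function names are hypothetical and only for illustration; nothing here talks to Consul itself. It enumerates every way to place the servers in two AZs and keeps the splits where each AZ alone would still hold a quorum.

```python
def quorum(n: int) -> int:
    """Strict majority of n voting servers (assumed Raft-style rule)."""
    return n // 2 + 1


def survives_either_az_loss(az_a: int, az_b: int) -> bool:
    """True only if each AZ on its own still holds a quorum of the whole cluster."""
    needed = quorum(az_a + az_b)
    return az_a >= needed and az_b >= needed


def surviving_splits(total: int) -> list[tuple[int, int]]:
    """All (az_a, az_b) placements of `total` servers that tolerate losing either AZ."""
    return [
        (a, total - a)
        for a in range(total + 1)
        if survives_either_az_loss(a, total - a)
    ]


if __name__ == "__main__":
    for total in (3, 5, 6):
        print(f"{total} servers: quorum={quorum(total)}, "
              f"safe splits={surviving_splits(total)}")
```

For 3, 5, and 6 servers the list of safe splits comes back empty, since two disjoint groups cannot both be a strict majority of the same cluster.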

Your logic seems sound to me.
