Unexpected Consul Server Resolution Issue Across VPC Peering

We have multiple VPCs in AWS, with VPC peering set up between the staging and prod environments. Suddenly, the Consul endpoints in prod started resolving to the staging Consul server. The peering connection has been in place for a long time, and we never had this issue before.

Upon checking the logs, we found that the Consul servers in staging were adding the prod Consul servers as members. We are not sure what caused this.

Here is a log message we found for this issue:

Jan 12 04:38:18 consul[58670]: 2024/01/12 04:38:18 [INFO] consul: adding server ip-10-100-0-140 (Addr: :8300) (DC: dc1)

Hi @sudharsans-nd,

Based on what you have shared, it looks like your Prod and Staging clusters have merged into one. You would not want to see this, especially in a production cluster.

To verify this, run consul members in one of your clusters. If the output lists nodes from both clusters, they have merged.
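If you would rather script the membership check, here is a minimal Python sketch. It assumes you have already fetched the member list as JSON (for example from the local agent's /v1/agent/members HTTP endpoint, which returns objects with Name and Addr fields) and that each environment lives in its own VPC CIDR. The function name find_foreign_members, the 10.200.0.0/16 range, and the local-consul-* node names are hypothetical; only ip-10-100-0-140 comes from the log above.

```python
import ipaddress


def find_foreign_members(members, expected_cidr):
    """Return the names of members whose address is outside expected_cidr.

    `members` is a list of dicts shaped like entries from /v1/agent/members
    (only the "Name" and "Addr" fields are used here).
    """
    network = ipaddress.ip_network(expected_cidr)
    return [
        m["Name"]
        for m in members
        if ipaddress.ip_address(m["Addr"]) not in network
    ]


# Hypothetical sample data: two local nodes plus the node from the log line.
sample_members = [
    {"Name": "local-consul-1", "Addr": "10.200.0.11"},
    {"Name": "local-consul-2", "Addr": "10.200.0.12"},
    {"Name": "ip-10-100-0-140", "Addr": "10.100.0.140"},
]

# Assumed CIDR for the cluster you are checking; substitute your own.
foreign = find_foreign_members(sample_members, "10.200.0.0/16")
print(foreign)  # ['ip-10-100-0-140']
```

Any name in the result is a node gossiping with this cluster from outside its own network range, which is a strong hint that the two clusters have joined.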

Ideally, each cluster should have its own gossip encryption key, so that agents from one cluster cannot join the other. This video gives a quick summary of this scenario: https://youtu.be/lAL7ocZQprE
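As a rough illustration of "one key per cluster", the sketch below generates a separate random key for each environment. The usual tool for this is the consul keygen command; this Python version only mimics the idea and assumes a base64-encoded 32-byte key, which is what recent Consul releases emit (older releases used 16 bytes, so check your version). The function name generate_gossip_key and the environment names are hypothetical.

```python
import base64
import secrets


def generate_gossip_key(num_bytes=32):
    """Return a random base64-encoded key of num_bytes bytes.

    Assumption: 32 bytes matches what recent `consul keygen` versions produce.
    """
    return base64.b64encode(secrets.token_bytes(num_bytes)).decode("ascii")


# One independent key per environment, never shared between them.
keys = {env: generate_gossip_key() for env in ("staging", "prod")}

for env in keys:
    # Print only the key length, not the secret itself.
    print(env, "key length (base64 chars):", len(keys[env]))
```

Each key would then go into the encrypt setting of the agents in that environment only. Note that turning on or changing gossip encryption on a running cluster needs a careful rollout, so treat this as a starting point rather than a procedure.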