Is this for a single datacenter configuration, or are you running multiple datacenters federated together?
Can you please provide some more information on the steps you took to change the primary DC?
I tried this out with a dev-mode agent and didn’t see anything in the logs when I changed the primary DC and ran consul reload. Dev mode doesn’t reflect a production configuration very well, though, so I’ll spin up a few nodes and play with the config over the next couple of days.
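For anyone reproducing this, here is a minimal sketch of the server settings involved. It assumes two hypothetical datacenter names, dc1 and dc2, and only uses the datacenter and primary_datacenter agent options; a real server config will have many more settings. It just builds the "before" and "after" configs as JSON so you can see exactly which key changes, and it does not talk to a running agent.

```python
import json


def server_config(datacenter: str, primary: str) -> dict:
    """Build a minimal, illustrative Consul server config as a dict."""
    return {
        "server": True,
        "datacenter": datacenter,        # the DC this server belongs to
        "primary_datacenter": primary,   # the DC treated as primary
    }


# Hypothetical names: dc2 starts as a secondary of dc1, then is made primary.
before = server_config("dc2", primary="dc1")
after = server_config("dc2", primary="dc2")

# Keys whose values differ between the two configs.
changed = sorted(k for k in before if before[k] != after[k])

print(json.dumps(after, indent=2))
print("changed keys:", changed)
```

Posting a diff like this alongside the logs makes it easier to compare results across clusters.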
I have this GitHub issue to track a request to rotate primary DCs: https://github.com/hashicorp/consul/issues/7817. Currently, changing the primary DC has a blast radius, and I am trying to collect data on what happens in different clusters when it is changed. Please subscribe to the issue if you’re interested in tracking it.
Can you post your server config file, as well as the command you are using to run the servers?
If anyone else reading this wants to try, feel free to post what you did and the results.
How much of the primary DC state is persisted in backups? My use case is reasonably small: I only have a couple of clusters and could feasibly take the whole lot down over a weekend to migrate.
I’ll do my own testing, but would it be safe to restore a backup of the existing primary to a new cluster? I could then effectively reset the others and join them to the new primary, and I think everything would still work…
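For testing that idea, a rough sketch of the backup and restore mechanics using Consul's snapshot HTTP endpoint (GET /v1/snapshot to save, PUT /v1/snapshot to restore). The addresses, the token value, and the helper names are all hypothetical, and this says nothing about whether the restored cluster is safe to promote to primary; it only shows how the state could be moved across for an experiment. The script builds the requests without sending them, so it is harmless to run as-is.

```python
import urllib.request

SNAPSHOT_PATH = "/v1/snapshot"


def save_request(base_url: str, token: str) -> urllib.request.Request:
    """Request that downloads a snapshot from the old primary."""
    return urllib.request.Request(
        base_url + SNAPSHOT_PATH,
        method="GET",
        headers={"X-Consul-Token": token},
    )


def restore_request(base_url: str, token: str, snapshot: bytes) -> urllib.request.Request:
    """Request that uploads a snapshot into the new cluster."""
    return urllib.request.Request(
        base_url + SNAPSHOT_PATH,
        data=snapshot,
        method="PUT",
        headers={"X-Consul-Token": token},
    )


# Hypothetical addresses and token; replace with your own before sending.
save = save_request("http://old-primary.example:8500", "my-token")
restore = restore_request("http://new-primary.example:8500", "my-token", b"snapshot-bytes")

print(save.get_method(), save.full_url)
print(restore.get_method(), restore.full_url)

# To actually run it, something like:
#   data = urllib.request.urlopen(save).read()
#   urllib.request.urlopen(restore_request(new_url, token, data))
```

The same thing can be done from the CLI with consul snapshot save and consul snapshot restore, which is probably simpler for a one-off weekend migration.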
We also have two datacenters and are contractually required to rotate between them once per year. We also cannot use public cloud due to contracts, so making a third datacenter in AWS isn’t an option.
According to the replication documentation, setting up replication will destroy all tokens in the secondary site.
If all the ACLs are managed with Terraform, theoretically it wouldn’t be too hard to re-run terraform apply to recreate the tokens after the switch.