Toggling resources across three environments

We are relatively new to managing our Azure resources with Terraform Cloud and are running into a few bumps in the road which I guess others would have run into in their journeys. Any guidance would be much appreciated.


We are using Terraform Cloud (TFC).

We have three Azure subscriptions - Test, Stage, Prod

We have three TFC workspaces - Test, Stage, Prod

We have three long lived branches - Test, Stage, Prod

Our workflow goes Test → Stage → Prod

On paper this seems reasonable, and with one person working this way it all runs smoothly. The first problem we’ve run into is:

What do we do if one person is testing something in the Test subscription which may take a week to test and someone else in the meantime wants to make a small change to an existing resource?

In our case we have someone running a proof of concept with Grafana and InfluxDB which includes records deployed in an application gateway…

…and someone else now wants to change a record in the application gateway.

If the second person makes the change and then wants to rebase the test branch onto stage, it’s going to deploy Grafana, InfluxDB and the application gateway record into stage, which is not wanted.

I looked at the Blue Green deployment blog post, which uses a boolean toggle to set the count on a resource to 1 or 0. Maybe we could use something like that, but we have three environments.

I might not be thinking about it right, but if we tried to use a similar approach to blue green, we’d need a lot of if…then or similar to mask out the unwanted resources.

Any thoughts??

Cheers
Phill

Running ad-hoc PoCs whose configuration overlaps with a test → stage → prod promotion-style workflow is the fundamental problem here, and it isn’t something Terraform can magically make go away.

Ideally you’d be using a totally different Git repo and Terraform workspace, as well as using entirely separate Azure resources.

If you can’t do all of those things, then at least using entirely separate Azure resources is the most important one.

If you can do that, then you could put all of the PoC into a separate Terraform module, and just put a single

  count = var.poc_enabled_in_this_environment ? 1 : 0

on the module block referencing the PoC module.
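
As a rough sketch of how that might look: the module name `grafana_poc` and the path `./modules/grafana-poc` are made up for illustration, and the variable would be set to true only in the Test workspace's variables. Note that `count` on a module block needs Terraform 0.13 or later.

```hcl
# Hypothetical flag: leave the default (false) in Stage and Prod,
# and set it to true only in the Test workspace.
variable "poc_enabled_in_this_environment" {
  type        = bool
  default     = false
  description = "Whether to deploy the Grafana/InfluxDB PoC in this environment."
}

# Hypothetical module path; the module would contain only PoC resources.
module "grafana_poc" {
  source = "./modules/grafana-poc"

  # One instance where the flag is true, none everywhere else.
  count = var.poc_enabled_in_this_environment ? 1 : 0
}
```

With the default left at false, promoting the branch to Stage and Prod carries the code along but creates nothing there. Anything that reads the module's outputs would then have to index it, e.g. `module.grafana_poc[0].some_output` (`some_output` being a placeholder name).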

If you can’t manage to separate it to wholly separate resources, then yes, you’re going to end up with PoC-related conditionals sprinkled through your production code.
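
To give a feel for what those conditionals tend to look like, here is a sketch of conditionally adding PoC entries to a shared list. The variable names `base_gateway_records` and `poc_gateway_records` are hypothetical, the records are shown as plain strings for brevity, and how the resulting list feeds into the application gateway (for example through a `dynamic` block) depends on your configuration.

```hcl
# Same flag as in the count expression; declared here so the snippet stands alone.
variable "poc_enabled_in_this_environment" {
  type    = bool
  default = false
}

# Hypothetical inputs: records every environment gets,
# and records that only the PoC needs.
variable "base_gateway_records" {
  type    = list(string)
  default = []
}

variable "poc_gateway_records" {
  type    = list(string)
  default = []
}

locals {
  # PoC records are appended only where the flag is true.
  gateway_records = concat(
    var.base_gateway_records,
    var.poc_enabled_in_this_environment ? var.poc_gateway_records : [],
  )
}
```

Every shared resource the PoC touches would need a similar expression, which is why keeping the PoC on separate resources is the cleaner option.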

In either case, this wouldn’t be a Blue/Green situation, since that specifically involves multiple production environments - it’s more like using a Feature Flag approach.