Recommended Approach(es) for Staging with Terraform

schwarzzz · August 21, 2020, 9:30am

I’ve been working with Terraform for a few weeks now, and as our solution grows, I keep scratching my head over how to use Terraform in “the best way” (for our project).

When deploying your infrastructure in multiple, separate stages (e.g. DEV, QAT, PROD), there is always a tension between flexibility, reproducibility (of deployments), and reduction of redundancy (in the Terraform configuration).

In most Terraform examples you find on the web or in literature, you’ll find a folder structure like

live
|- dev
|- qat
|- prod

with each sub-folder hosting a Terraform root module.
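
As a rough sketch of what one of these root modules typically contains (the module path, variable names, bucket name, and values are hypothetical, not taken from any specific project), each stage calls a shared module with its own settings:

```hcl
# live/dev/main.tf (hypothetical example)

terraform {
  # Each stage's root module pins its own backend and state location.
  backend "s3" {
    bucket = "example-tfstate-dev" # assumed bucket name
    key    = "app/terraform.tfstate"
    region = "eu-central-1"
  }
}

# Shared logic lives in a reusable module; only the inputs differ per stage.
module "app" {
  source = "../../modules/app" # hypothetical shared module

  environment    = "dev"
  instance_count = 1
}
```

live/qat/main.tf and live/prod/main.tf would repeat this with different values, which is where the redundancy comes from.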

IMO, this solution favors flexibility over reproducibility and reduction of redundancy. It probably also helps to reduce complexity within referenced modules. That’s fine.

However, one could argue that Terraform workspaces (the CLI feature, not Terraform Cloud workspaces) are an alternative approach that favors reproducibility and reduction of redundancy over flexibility.

But then the Terraform documentation states that workspaces are not suitable for staging scenarios, pointing out that each stage will probably have its own backend.

From https://www.terraform.io/docs/state/workspaces.html#when-to-use-multiple-workspaces:

In particular, organizations commonly want to create a strong separation between multiple deployments of the same infrastructure serving different development stages (e.g. staging vs. production) or different internal teams. In this case, the backend used for each deployment often belongs to that deployment, with different credentials and access controls. Named workspaces are not a suitable isolation mechanism for this scenario.

Instead, use one or more re-usable modules to represent the common elements and then represent each instance as a separate configuration…

I do not agree with that argument.

While it is probably true that the backends are separated by stage, the backend information can easily be injected (e.g. via environment variables). So it is possible to deploy the same configuration to different stages with different backends, as long as the deploying entity has the proper permissions (or different service principals are used for deployment to each stage).
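
A minimal sketch of what I mean, assuming an S3 backend purely for illustration (the file name, bucket name, and region are made up): the configuration leaves the backend settings empty, and each stage supplies them at init time.

```hcl
# main.tf: one configuration shared by all stages
terraform {
  # Partial backend configuration: only the backend type is fixed here.
  backend "s3" {}
}

# dev.s3.tfbackend: a separate file holding the settings for one stage
# (hypothetical values), shown here as comments:
#
#   bucket = "example-tfstate-dev"
#   key    = "app/terraform.tfstate"
#   region = "eu-central-1"
#
# Selecting the stage at init time:
#
#   terraform init -backend-config=dev.s3.tfbackend
#
# Credentials stay out of the files and come from the environment,
# e.g. AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY for the S3 backend.
```

A qat or prod deployment would use the same main.tf with a different settings file and different credentials.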

Of course, writing to different backends makes the use of workspaces pointless (you’ll only have one state in the backend and there is no need for disambiguation of the name) but it wouldn’t hurt, either.

Are there further arguments against using a single Terraform configuration for staging (with or without using workspaces)?

Which approach do you follow and why?

Thanks a lot!

stuart-c · August 21, 2020, 5:19pm

We do a combination of different backends & different workspaces.

So the “live” and “dev” environments are in different backends (in fact stored in different AWS accounts). Within those backends we then use workspaces. So, for example, in the dev backend we might have dev, qa, stress test, etc. workspaces, one per environment (we actually use the default workspace for one environment per backend).
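
To illustrate the workspace part (a sketch only, with made-up names and sizes rather than the actual configuration described above), a configuration can read the current workspace name through terraform.workspace and derive per-environment settings from it:

```hcl
locals {
  # Name of the currently selected workspace, e.g. "default", "qa", "stress".
  environment = terraform.workspace

  # Hypothetical per-workspace sizing.
  instance_counts = {
    default = 1
    qa      = 2
    stress  = 4
  }

  # Fall back to 1 for workspaces not listed above.
  instance_count = lookup(local.instance_counts, terraform.workspace, 1)
}

# Workspaces are created and selected from the CLI:
#
#   terraform workspace new qa
#   terraform workspace select qa
```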

apparentlymart · August 22, 2020, 12:59am

The recommendations in that guide are aimed at those who are running Terraform in the default way and want to keep all of the hostnames, regions, etc. in their main configurations so they can just set up their credentials and run terraform init and terraform apply directly.

If you’re running Terraform in automation then indeed you have a number of other options available, such as generating backend configurations dynamically, or arranging for the automation to set environment variables correctly.

That’s a fine way to go if you’re willing to commit to only running Terraform in your automation system, and many teams do work that way. It’s worth keeping in mind, though, that the further you diverge from the “default workflow” (which that workspaces guide assumes), the harder it can be to run Terraform in a different way if you need to, such as when you switch to a different automation solution, or when your automation has an outage and you need to run Terraform locally in an emergency.

As is often the case with any system involving many cooperating components, there’s no single right answer that’s correct for all cases. You’ll need to consider your specific needs, preferences and assumptions to decide what tradeoffs to make. The guidance in the Terraform documentation is intended as a starting point based on what we’ve learned from discussions with folks who have tried different approaches, but if you find that the advice doesn’t suit your needs then Terraform is a pragmatic tool that supports a number of “non-standard” workflows to allow for a wider variety of use-cases.