Blue/Green workspace layout

Are there any reference implementations for handling blue/green deployments in Terraform?

Hi @tsiq-dan,

Terraform does not have any specialized features for blue/green deployment. There are a few different ways to achieve it, but which one is appropriate for your use-case will depend on what exactly you are trying to deploy.

Can you share a little more detail? It would be helpful to know which platform(s) you are deploying to and what services you are planning to use.

Say, for example, we have an AWS EB application stack where we want to deploy prod-blue and prod-green versions. Let's also say there is supporting infrastructure such as IAM roles, permissions, etc.

We could spin up two different workspaces (prod-blue/prod-green) that reference the same IAM and EB modules. In this model, an update to the module will impact both environments, effectively allowing only the app layer to be blue/green. Additionally, there is a decent amount of duplicated code, given that the only difference between the prod-blue and prod-green layers is whether one is enabled or not.

I’m assuming for the sake of this reply that “EB” here means Elastic Beanstalk.

Elastic Beanstalk has a built-in mechanism for managing Blue/Green deployments, and in such cases we usually suggest relying on that existing feature rather than re-implementing a blue/green deployment pattern in Terraform itself.

Unfortunately the Terraform AWS provider does not directly expose the SwapEnvironmentCNAMEs operation that provides the blue/green environment pivot in Elastic Beanstalk.

With that said, my recommendation would be to use Terraform to model the changes to the individual environments but to use some custom automation outside of Terraform to do the SwapEnvironmentCNAMEs call to pivot between them once you’re satisfied that your terraform apply has produced a suitable result.

A key decision point in modelling this sort of multi-step deployment in Terraform is to decide whether the Terraform configuration itself is participating in the multi-step deployment, or if instead Terraform is just managing the external system that is handling the deployment. Based on your comment, it sounds like you would like the Terraform configuration itself to also roll out in multiple steps.

The way I achieved that in a previous role (before I joined the Terraform team) was to write the infrastructure that will participate in deployment in a separate, versioned Terraform module. Then I wrote a simple top-level configuration that describes the overall desired state and that changes multiple times during a multi-step deployment.

At initial rollout, that top-level configuration contains only one instance of the module:

module "happyapp-20190925-1" {
  source  = ""
  version = "1.0.0"

  # Pass the current "generation" identifier into the module
  # in case it's needed to dedupe object names, assign
  # tags, etc.
  generation = "20190925-1"

  # ...and any other "per-generation" values you might need
  # to set, though best to keep this to a minimum to reduce
  # overhead during multi-step deploys.
}

When you want to roll out a change, you’d then add a new module block alongside that first one, describing the environment you intend to pivot to while leaving the current live environment unchanged:

module "happyapp-20190925-1" {
  source  = ""
  version = "1.0.0"

  generation = "20190925-1"
}

module "happyapp-20190926-1" {
  source  = ""
  version = "1.1.0"

  generation = "20190926-1"
}

In this case there is a new version of the module too, so the module changes are also included in the gradual roll-out. If the module itself hasn’t changed then the version number might stay the same and other aspects of the configuration change instead.

You can then run terraform apply, verifying that the plan does indeed leave the old “generation” unchanged and only creates the new one. Once that completes successfully, you have two environments, and can verify that the new one is behaving as expected using whatever testing/monitoring you’d normally use to decide if a new environment is functioning correctly.

Next you’d use scripting of your own outside of Terraform to run the Elastic Beanstalk SwapEnvironmentCNAMEs operation to pivot over to the new environment. After that succeeds, you’ll presumably monitor the new environment a little more to make sure that it’s working for your end-users. You can decide to roll back at this point if you want by calling SwapEnvironmentCNAMEs again, or you can move on to the next rollout step.
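As a rough illustration of that out-of-band pivot step, here is a minimal Python sketch. It assumes the boto3 Elastic Beanstalk client, whose swap_environment_cnames and describe_environments calls map onto the API operations of the same names. The environment names, the pivot helper, and the RecordingClient stand-in are all hypothetical; the stand-in only exists so the sketch runs without AWS credentials.

```python
class RecordingClient:
    """Hypothetical offline stand-in for boto3.client("elasticbeanstalk")."""

    def __init__(self, statuses):
        self.statuses = statuses  # environment name -> Status string
        self.swaps = []           # (source, destination) pairs that were swapped

    def describe_environments(self, EnvironmentNames):
        return {"Environments": [
            {"EnvironmentName": name, "Status": self.statuses[name]}
            for name in EnvironmentNames
        ]}

    def swap_environment_cnames(self, SourceEnvironmentName, DestinationEnvironmentName):
        self.swaps.append((SourceEnvironmentName, DestinationEnvironmentName))


def pivot(client, live_env, candidate_env):
    """Swap the two environments' CNAMEs, but only if both report Status "Ready"."""
    found = client.describe_environments(EnvironmentNames=[live_env, candidate_env])
    envs = found["Environments"]
    not_ready = [e["EnvironmentName"] for e in envs if e["Status"] != "Ready"]
    if len(envs) != 2 or not_ready:
        raise RuntimeError(f"refusing to swap; environments not ready: {not_ready}")
    client.swap_environment_cnames(
        SourceEnvironmentName=live_env,
        DestinationEnvironmentName=candidate_env,
    )


# Assumed names: one environment per module "generation".
client = RecordingClient({
    "happyapp-20190925-1": "Ready",
    "happyapp-20190926-1": "Ready",
})
pivot(client, "happyapp-20190925-1", "happyapp-20190926-1")
print(client.swaps)  # [('happyapp-20190925-1', 'happyapp-20190926-1')]
```

Against a real account you would pass boto3.client("elasticbeanstalk") instead of the stand-in. Because the swap is symmetric, running pivot a second time is the rollback.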

Finally, remove the module "happyapp-20190925-1" block that represents the now-idle old environment and run terraform apply one more time, verifying again that the plan only affects the old module. Once that apply succeeds, your deployment is complete.

This is an example of using Terraform as a building block within a larger process rather than the entire solution. Terraform is not designed as an application deployment tool and so its built-in features alone are not sufficient here, but you can use it as part of broader automation in order to handle the problem of moving from a current state to a desired state, even if your rollout process has multiple intermediate “desired states”.

If you’d like to build automation around Terraform like this, you might find the terraform show -json command useful in order to get a machine-readable description of Terraform’s plan so that your automation tool can guarantee that it won’t apply a change that would affect the current production environment.
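For instance, a guard along these lines could sit between terraform plan and terraform apply. This is only a sketch: it relies on the resource_changes list in the JSON plan output, where each entry carries address, module_address and change.actions. The function name, the sample plan, and the resource name "main" are made up for illustration, and treating "read" as harmless is my own assumption.

```python
# Actions that do not modify real infrastructure (assumption: data-source
# reads are acceptable inside the protected module).
HARMLESS_ACTIONS = (["no-op"], ["read"])


def changes_touching_module(plan, module_address):
    """Return addresses of resources in module_address (or in modules nested
    inside it) that the plan would create, update, delete or replace."""
    touched = []
    for rc in plan.get("resource_changes", []):
        owner = rc.get("module_address", "")  # absent for root-module resources
        inside = owner == module_address or owner.startswith(module_address + ".")
        actions = rc.get("change", {}).get("actions", [])
        if inside and actions not in HARMLESS_ACTIONS:
            touched.append(rc["address"])
    return touched


# Hand-written sample in the shape of `terraform show -json` output.
sample_plan = {
    "resource_changes": [
        {
            "address": "module.happyapp-20190925-1.aws_elastic_beanstalk_environment.main",
            "module_address": "module.happyapp-20190925-1",
            "change": {"actions": ["no-op"]},
        },
        {
            "address": "module.happyapp-20190926-1.aws_elastic_beanstalk_environment.main",
            "module_address": "module.happyapp-20190926-1",
            "change": {"actions": ["create"]},
        },
    ]
}

# The live generation is untouched, so the guard reports nothing.
print(changes_touching_module(sample_plan, "module.happyapp-20190925-1"))  # []
```

In real automation you would save the plan with terraform plan -out=tfplan, load the output of terraform show -json tfplan with json.load, and refuse to run terraform apply tfplan if the returned list is non-empty for the live generation's module.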

With that said, you might also reasonably decide that Terraform is not the right tool for this job. As I mentioned above, deploying applications isn’t a core use-case for Terraform, although it can be a useful building block if your deployment process includes various cloud API calls that Terraform providers can effectively abstract over. Elastic Beanstalk itself is already an application deployment abstraction, so you might find it more straightforward in the end to script your automation directly against its own API, and potentially use Terraform just for the long-lived supporting infrastructure such as VPCs, DNS zones, etc.

I hope that helps!