How does new code get deployed to the cloud with Terraform?

I’m new to both Terraform and cloud services, so this question might be stupidly simple for the more experienced folks.

Terraform is all about deploying infrastructure. After configuring the backend (whether storing state locally, in an S3 bucket with DynamoDB for locking, on Terraform Cloud, etc.), every change to the Terraform code followed by terraform apply will idempotently change the infrastructure. This makes sense to me.
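For reference, this is the kind of backend block I mean (all the names here are just placeholders, not my real resources):

terraform {
  backend "s3" {
    bucket         = "my-terraform-state"          # placeholder bucket name
    key            = "portfolio/terraform.tfstate"
    region         = "eu-north-1"
    dynamodb_table = "terraform-locks"             # optional, for state locking
  }
}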

Where everything falls apart, however, is when I want to run a service on one of the deployed virtual machines. Let’s take a hello world NodeJS + Express web server as an example.

In my world view, this is what I expected to be able to do:

  1. Define infrastructure and run terraform apply
# see https://github.com/revosw/portfolioassignment/blob/main/main.tf
# for entire terraform file
resource "aws_instance" "web" {
  # Which virtual machine image this VM should be based on
  ami = "ami-0bd9c26722573e69b"
  # How much hardware this instance needs
  instance_type = "t3.micro"
  # What commands should be executed when the VM boots
  user_data = file("install.sh")
  # The ssh key pair to apply to this virtual machine
  key_name = "terraform-abcd"
  # Required for network setup
  security_groups = ["terraform-sg-abcd"]
}
  2. In the file install.sh, set up my virtual machine
#!/bin/bash
# Create the user non-interactively (a plain adduser would prompt for a
# password, which hangs when run from user_data; flags are Debian/Ubuntu style)
adduser --disabled-password --gecos "" webmaster

# Install Volta - node and npm manager
curl https://get.volta.sh | bash
. ~/.profile
volta install node@17

# Clone repository
git clone https://github.com/revosw/terraform-project
cd terraform-project

# Install dependencies and start server
npm install
npm start
  3. Define a GitHub Action to redeploy the affected instances whenever the Express web server application code changes

Point 3 is where it falls apart. The problem is that Terraform will only redeploy when the infrastructure definition changes, and this is an application code change. Thus, Terraform is gonna shrug and say “everything looks unchanged to me”.

So my question is: what tool am I missing to automatically deploy new application code to the cloud using some GitHub Action?

As with everything, there are lots of different options depending on what you are trying to achieve.

You could just have Terraform handle the infrastructure and then use a different tool to deploy the application (such as Ansible).

If you want Terraform to deploy the application as well, you need to decide how that’s going to happen. From your post it looks like you just want the application installed & started when an EC2 instance boots up.

To make that work you need to ensure that the EC2 instance is destroyed & recreated whenever there is a new application version to deploy. Looking at your script, you are doing a git clone of your code. As things currently stand, changing the application won’t trigger any form of change to the EC2 instance, meaning the instance won’t get recreated & a new version of the app started.

It sounds like you want a full continuous deployment flow (rather than picking a specific version of the app to deploy). You therefore need to add something to your Terraform code that tells it a bit more about the application, so that a rebuild of the EC2 instance can be triggered.

I’d suggest something like the following:

  1. Change your install.sh to be a template, rather than using file() to include it directly.
  2. Add a template variable within the script which embeds a hash identifying the current code version (it can be just a comment in the script)
  3. Pass that hash into the template (using templatefile()) from the github_branch data source (https://registry.terraform.io/providers/integrations/github/latest/docs/data-sources/branch). You’d add that data source to expose the SHA1 hash of the commit at the HEAD of your main/master branch.

What this does is update the install script any time the HEAD commit changes (as there would be a different SHA1 hash). As the script is passed into the EC2 instance via user_data, this triggers a change to the instance, in this case causing the box to be destroyed & recreated. Upon startup your script then fetches the latest code.
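A rough sketch of how those three steps fit together (the repository name, branch, and the install.sh.tftpl filename are assumptions based on the post above):

data "github_branch" "main" {
  repository = "terraform-project"
  branch     = "main"
}

resource "aws_instance" "web" {
  ami             = "ami-0bd9c26722573e69b"
  instance_type   = "t3.micro"
  key_name        = "terraform-abcd"
  security_groups = ["terraform-sg-abcd"]

  # A new commit on main gives a different SHA1, which changes the rendered
  # script, which changes user_data and so forces the instance to be replaced.
  user_data = templatefile("${path.module}/install.sh.tftpl", {
    code_version = data.github_branch.main.sha
  })
}

Inside install.sh.tftpl the only addition needed is a line like # code version: ${code_version}, so that the hash actually shows up in the rendered output.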

If you do want Terraform to react to changes in the GitHub repository then you can in principle use data sources from the integrations/github provider to read commit ID information directly from the GitHub repository, as long as the Terraform process and its child processes for plugins have access to GitHub credentials. Since you are intending to use GitHub Actions I assume that will be true.
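For example, a minimal provider block might look like this (the owner is taken from your repository URL, and the token would come from a GITHUB_TOKEN environment variable that your workflow exports):

provider "github" {
  # Credentials are read from the GITHUB_TOKEN environment variable,
  # so nothing sensitive needs to appear in the configuration itself.
  owner = "revosw"
}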

This does have some hazards, though: Terraform is not intended to be a “build” tool and so it doesn’t have the typical workflow of a build tool where each run produces a new artifact and publishes it alongside other artifacts. Instead, Terraform manages individual long-lived objects and will either update them in-place or replace them (depending on the capabilities of the underlying API) in response to changes. In practice that means that e.g. if you add a commit with a bug then it’s likely that Terraform will already have destroyed the previously-working EC2 instance before you notice that the new one isn’t working, leaving you with nothing running.

Typically I recommend a two-stage process where there’s a build step, implemented outside of Terraform, which produces an artifact and publishes it somewhere that the deploy step can later retrieve it. In your case, the build step might be just to clone the repository and run npm pack to produce an archive file as the artifact.

The “deploy” step will then take an identifier for an artifact to deploy as its input. If you want to implement the deploy step using Terraform then that would mean that you’d declare an input variable for the Terraform configuration and then make your deploy step automation pass the artifact identifier into it.
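As a sketch of that deploy-side wiring (the variable name and the idea of threading the artifact location through user_data are assumptions, not a prescribed pattern):

variable "artifact_url" {
  type        = string
  description = "Identifier/location of the build artifact to deploy"
}

resource "aws_instance" "web" {
  ami             = "ami-0bd9c26722573e69b"
  instance_type   = "t3.micro"
  key_name        = "terraform-abcd"
  security_groups = ["terraform-sg-abcd"]

  # Because user_data is derived from the artifact identifier, a new artifact
  # yields different user_data and so Terraform plans a replacement.
  user_data = templatefile("${path.module}/install.sh.tftpl", {
    artifact_url = var.artifact_url
  })
}

Your deploy automation would then run something like terraform apply -var="artifact_url=..." with the identifier produced by the build step.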

I’m not super familiar with GitHub Actions but I think one way to achieve this sort of model in that environment would be for your build step to be a workflow which responds to the “push” event for your main branch, and then the deploy step to be a separate workflow which uses the “workflow_dispatch” event to allow explicitly triggering a deployment, taking the job ID from the other workflow as an input. The second workflow would then fetch the build artifact from the given job and use it to deploy.

An advantage of this separation is that your build artifacts live separately from the EC2 instances that will run them, and so if you need to you can always re-run the deployment step with an earlier job ID to revert back to that artifact. If you are using Terraform for the deploy step then the important requirement would be that something in the EC2 instance configuration is derived from the given build artifact, so that Terraform knows the EC2 instance must be replaced each time the artifact changes.

The details of how to achieve that for a specific case are a bit long to get into immediately here, but if you have specific questions about it I’d be happy to try to answer them. Notice though that I said if your deployment step is Terraform; it’s also valid, as @stuart-c noted, to use Terraform just to create the long-lived foundational infrastructure that will survive across many deployments and then use some other solution for deploying new code onto servers. For example, HashiCorp Waypoint is a tool focused specifically on the problem of building and deploying software into pre-existing platform infrastructure, although typical use of it requires a little more foundational infrastructure than just a bare EC2 instance.

Thank you both for your thorough replies! Not destroying EC2 instances for every merge to the main branch sounds like an approach I want to dive deeper into. I have to find out how I can make, for example, Ansible/Waypoint discover the new server if I decide to create another one in Terraform.

I’ll be doing some more cloud stuff around Christmas. I’ll definitely let you know if I have some questions that are hard to google, @apparentlymart. It’s gonna be a long journey discovering the scope and limitations of different tools; I’m dipping my toes into Docker as well.