Help with developing and using a terraform module I published to the Terraform Registry

briancaffey · March 8, 2022, 9:01pm

Hey everyone! I’m working on developing a set of terraform modules that I can use for deploying web applications for different environments (ad-hoc per developer, dev, qa, rc, stage, prod, demo, etc.)

I originally learned a lot about Infrastructure as Code by using CloudFormation and then CDK, and I have published a CDK construct library that I can use for the same purpose of deploying web applications, here’s the link: https://www.npmjs.com/package/django-cdk. I’m basically trying to create something similar with Terraform.

I started working on a terraform configuration that lives in the same repo as my Django + Vue.js application monorepo using one big main.tf file, then broke that into modules and refactored it to find the right abstraction and organization for the modules, and that has been working really well. For the next step in abstraction, I tried creating a new repo called terraform-aws-django and published it to the Terraform Registry (link).

I’m now having trouble trying to use this module in a live repo that will be a 1:1 mapping of what I have deployed in my AWS accounts.

In the Terraform Up and Running book (version 1, which I’m realizing is quite old at this point), I read that the most DRY way to do this is to have only *.tfvars files in the live repo and use a special source parameter in the *.tfvars files that points to a versioned git repo where my modules live. Was this an old feature of Terraform that is now removed? I asked about this in more detail in this StackOverflow question and I was told that this is not possible.

This seemed like a great way to organize my live terraform repo when I read the book, but after asking that question on SO and looking at the updated version 2 of that book, I see that this is no longer supported. In the new and updated book, there is instead a recommendation to use terragrunt and *.hcl files, but I don’t want to use another wrapper/tool since I’m still trying to figure out a way to do everything with only Terraform.

What I’m trying to do now seems like it might work, but it also feels like it will involve lots of duplicated code where I was hoping to keep things simpler. In my live repo I have a folder per environment with five files:

main.tf → this just calls my module from the Terraform Registry (I’ll call it app here) using source and version, and passes values in as module arguments (param = var.my_param)
variables.tf → this declares all of the variables that I pass to the single module I call from the root module (it is a copy of the variables declared at the root level of the published module, which I’m now calling as a child module)
outputs.tf → The module has some outputs that I need to use in a CI/CD pipeline. I now need to define a new set of outputs on the parent module that repeat the outputs declared at the root level of the child module (again this seems to me like the wrong way)
providers.tf → I originally defined a providers.tf on the root level of my app module, but this won’t work if I am calling the module as a child module, since the terraform block needs to be defined in the root module.
env.tfvars → this is a file that I use to define the inputs for the live environment. Again I was originally hoping to use just this one file and define source parameter in it as described in the last chapter of v1 of the Terraform Up and Running book, but I don’t think this is going to be possible. I could then have any number of other environments defined by other other-env.tfvars files and then point to the folder when I do terraform init and terraform apply in my pipelines.
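
On the providers.tf point, one possible split is sketched here: the shared module only declares which providers it requires, while the live root module holds the actual provider configuration. The file paths and the version constraint are assumptions for illustration, not taken from the published module.

```hcl
# --- shared module, e.g. terraform-aws-django/versions.tf (hypothetical path) ---
# A module that is called as a child can still declare its provider
# requirements. It should not contain a backend block, and it normally
# leaves the provider "aws" configuration block to the caller.
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = ">= 4.0" # hypothetical constraint
    }
  }
}

# --- live root module, e.g. live/dev/providers.tf (hypothetical path) ---
# The root module configures the provider; the child module inherits this
# default configuration automatically.
provider "aws" {
  region = "us-east-1"
}
```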

My main goal here is to use a remote module as the root module for a “live” terraform configuration that I can define minimally with a single *.tfvars file (if this is even possible).

Thanks for having a look at my question, I’m eager to get the best and most DRY patterns in place for my practice terraform repo so I can get on to the other parts of my CI/CD pipeline that will automate the terraform init/plan/apply. If there is any other information or details I can share, I would be more than happy to do so!

apparentlymart · March 8, 2022, 11:18pm

Hi @briancaffey,

I’m the same Martin Atkins who answered you on Stack Overflow.

Terraform has never supported using .tfvars files for anything other than specifying root module variables and so seeing you again mention this being claimed in the first edition of Terraform Up and Running piqued my curiosity here.

With the help of a coworker who has a copy of the book we tracked down the content we think you’re referring to. It seems to be on page 167, in a chapter called “Workflow”. I’m going to quote just the relevant content here for context and hope that the author won’t mind since it’s from an obsolete edition of the book anyway, and I believe this advice is no longer current:

Notice how there are no Terraform configurations (*.tf files) in the live repo. Instead, each .tfvars file specifies where its configurations live using a special parameter. For example, to deploy the frontend-app module in the production environment, you might have the following settings in us-east-1/prod/frontend-app.tfvars:
source = "git::git@github.com:foo/modules.git//frontend-app?ref=v0.0.3"

aws_region = "us-east-1"
environment_name = "prod"
frontend_app_instance_type = "m4.large"
frontend_app_instance_count = 10

[snip: another example for staging, with slightly different settings but same source]

Both .tfvars files specify the location of their Terraform configurations using the source parameter, which can specify either a local file path or a versioned Git URL. The .tfvars files also define values for every variable in those Terraform configurations.

To do a deployment, you can create a script that takes the path to a .tfvars file as an input and does the following⁵:

Run terraform init to check out the modules repo from the URL specified in the source parameter of the .tfvars file.

Run terraform apply -var-file <TF_VARS_PATH>, where TF_VARS_PATH is the path to the .tfvars file.

⁵ Terragrunt has support built-in for this workflow.

It took me a few reads to follow exactly what this text is describing, but I think the crucial words are in the paragraph just before the final list of steps: “you can create a script”. This section is proposing that you write your own wrapper around Terraform which parses this pseudo-.tfvars file to obtain the source value (I say “pseudo” because from Terraform’s perspective it has an invalid extra entry) and then runs Terraform as a child process in the manner indicated.

The Terraform that was current at the time of this edition’s publication would silently ignore invalid entries in the .tfvars file, and so I think the intention here is that the wrapper would then pass the same .tfvars file to that terraform apply command so that Terraform could see all of the other (non-source) values in there to populate the variables, after silently ignoring the invalid source argument.

The footnote suggests that this text was recommending using Terragrunt as that script, and although I wasn’t super familiar with Terragrunt’s behavior at that time I do recall that it was designed quite differently when intended for use with older versions of Terraform, and so it makes sense to me that there was some older version of Terragrunt that would understand this “pseudo-.tfvars” file and run Terraform in the way the text describes.

This behavior was not a part of Terraform itself though, and I believe today’s Terragrunt achieves a similar result in a different way, using its own terragrunt.hcl language.

With all of that said then, you mentioned you don’t want to use Terragrunt and so I think my advice to you here is the same as it was in my Stack Overflow answer. A Terraform configuration containing just a single .tf file is the smallest possible unit of Terraform execution, and that file can consist primarily of calls to one or more external modules if you’d like.

For example, if you placed the following in a main.tf file then you’d have the equivalent of the prod/frontend-app.tfvars example I quoted from the book above:

module "main" {
  source = "git::git@github.com:foo/modules.git//frontend-app?ref=v0.0.3"

  aws_region                  = "us-east-1"
  environment_name            = "prod"
  frontend_app_instance_type  = "m4.large"
  frontend_app_instance_count = 10
}

A shared Terraform module is typically not entirely self-sufficient in this way though. I’m not sure if this old version of Terragrunt was doing something additional here or if there were some items in that module that papered over this, but a root Terraform module should typically also include configurations for the providers it will use and a backend configuration to specify where the state will be stored, and so a more realistic main.tf file for someone using AWS might look like this:

terraform {
  backend "s3" {
    # (S3 backend configuration)
  }

  required_providers {
    aws = {
      source = "hashicorp/aws"
    }
  }
}

provider "aws" {
  region = "us-east-1"
}

module "main" {
  source = "git::git@github.com:foo/modules.git//frontend-app?ref=v0.0.3"

  environment_name            = "prod"
  frontend_app_instance_type  = "m4.large"
  frontend_app_instance_count = 10
}

If your goal were to use these different configurations to represent different deployment stages / environments then each environment would have its own version of this main.tf specifying each environment’s specific settings, but all of the real declarations would be inside the shared module(s) called by these root modules, thus avoiding the duplication of those common declarations.
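
As a rough sketch of what a second environment could look like under this layout, here is a staging counterpart to the file above. The directory name, the S3 bucket and key, and the smaller instance settings are all made-up values for illustration; only the module source is carried over from the example.

```hcl
# stage/main.tf (hypothetical path): same module and version as prod,
# with environment-specific settings.
terraform {
  backend "s3" {
    # Hypothetical values. Each environment needs its own state location.
    bucket = "example-terraform-state"
    key    = "stage/frontend-app/terraform.tfstate"
    region = "us-east-1"
  }

  required_providers {
    aws = {
      source = "hashicorp/aws"
    }
  }
}

provider "aws" {
  region = "us-east-1"
}

module "main" {
  source = "git::git@github.com:foo/modules.git//frontend-app?ref=v0.0.3"

  environment_name            = "stage"
  frontend_app_instance_type  = "t2.micro" # hypothetical smaller size
  frontend_app_instance_count = 1          # hypothetical
}
```

Only the backend key and the three module arguments differ from the prod file, so the duplicated part stays small and mechanical.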

apparentlymart · March 8, 2022, 11:23pm

FWIW, after reviewing some of the discussions I had with the Terragrunt authors during the Terraform v0.12 development period I remembered the crucial detail that I think explains what the first edition of the book was describing:

The original Terragrunt design was for it to not have its own separate configuration file but instead to overload the .tfvars format with additional arguments which it would parse, and then it would pass that .tfvars-with-extra-stuff file to Terraform CLI under the assumption that it would silently ignore unexpected arguments.

Terraform v0.12 introduced what was originally an error and then later became a warning when it detected unexpected arguments inside a .tfvars file, because previously users had been frustrated by a total lack of feedback if they made a typo of a variable name. In response to that, Terragrunt switched to using its own separate language in terragrunt.hcl files, to give Terraform back exclusive control of its .tfvars file format.
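
To make the typo case concrete, here is a small sketch. The variable name comes from the book excerpt; the misspelled entry and the file names are invented for illustration.

```hcl
# variables.tf in the root module
variable "environment_name" {
  type = string
}

# prod.tfvars is a separate file, shown here as comments:
#
#   enviroment_name = "prod"   # misspelled, so no declared variable matches
#
# Before v0.12 the misspelled entry was ignored without any feedback.
# Since v0.12, passing this file with -var-file reports the undeclared
# name, which is also why an extra "source" entry in a .tfvars file no
# longer goes unnoticed.
```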

So all of this is to say: the advice in the book is talking about using an obsolete version of Terragrunt with an obsolete version of Terraform, and so it’s true that what is being described here never worked with Terraform alone but it did at one point work with Terragrunt. Terragrunt now achieves a similar result in a different way using its own configuration language, and so I assume later editions of the book describe how to achieve a similar effect with the Terragrunt configuration language.
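
For comparison only, the book's prod example would look roughly like this in the current Terragrunt configuration language. This is a sketch based on Terragrunt's documented terraform and inputs constructs, not something taken from the book or from this thread, and the file path is an assumption.

```hcl
# us-east-1/prod/frontend-app/terragrunt.hcl (hypothetical path)
# This file is read by Terragrunt, not by Terraform itself.
terraform {
  source = "git::git@github.com:foo/modules.git//frontend-app?ref=v0.0.3"
}

# Terragrunt passes these to Terraform as input variable values.
inputs = {
  aws_region                  = "us-east-1"
  environment_name            = "prod"
  frontend_app_instance_type  = "m4.large"
  frontend_app_instance_count = 10
}
```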

briancaffey · March 13, 2022, 2:20pm

Thanks so much for your reply @apparentlymart, this is making a lot of sense now and I see how I missed the point of what was being explained in the book about using only *.tfvars files. I added an examples directory to my module which uses the pattern that you described.

I did like the idea that was described in that book, but I see that it is not valid given how terraform modules are designed to work.

My external module on the terraform registry has several required and optional variables as well as outputs that I need, and I’m finding that I have to copy all of these over into the new “live” module where I am using this external module, which isn’t too bad I guess. Once I set up these variables on the “live” module, I can define several *.tfvars files that I can use to define multiple environments (for example ad-hoc environments for different developers, teams, stages of the SDLC, etc.).
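
A minimal sketch of that pass-through layout follows. The registry address, version, variable names, and output name are all hypothetical placeholders, since the module's real interface isn't shown in this thread.

```hcl
# main.tf
module "app" {
  source  = "example-namespace/django/aws" # hypothetical registry address
  version = "0.1.0"                        # hypothetical version

  # Forward root module variables to the child module.
  environment_name = var.environment_name
  instance_count   = var.instance_count
}

# variables.tf: redeclare the inputs that should vary per environment
variable "environment_name" {
  type        = string
  description = "Name of the environment (hypothetical input)"
}

variable "instance_count" {
  type        = number
  description = "Number of app instances (hypothetical input)"
  default     = 1
}

# outputs.tf: re-export the child module outputs the pipeline needs
output "load_balancer_dns_name" {
  value = module.app.load_balancer_dns_name # hypothetical output name
}

# dev.tfvars is a separate file, shown here as comments:
#
#   environment_name = "dev"
#   instance_count   = 1
```

Each environment would then be applied with something like terraform apply -var-file=dev.tfvars. Note that environments sharing one directory also share its backend configuration, so each one still needs its own state, for example through workspaces or a per-environment -backend-config passed to terraform init.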