Terraform remote state outputs excavation

Hey all,

I am struggling with Terraform, which is using remote state (S3) to store the output values that I need as inputs (for ECS Fargate). The problem is that AWS ECS does not split the infrastructure (VPC subnets, security groups) from the application/platform layer, which makes it really awkward to use.

I have to run terraform state pull, and then:

    for subnet in $(terraform state list | grep "aws_subnet.private"); do
      terraform state show "$subnet" | grep ...
    done

or use jq to pull the necessary information out of the state JSON. I wonder, is there a better way of doing this?
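
For reference, the jq variant ends up as something like this (a sketch against the default state JSON layout; "private" is the subnet resource name in my config):

    terraform state pull \
      | jq -r '.resources[]
               | select(.type == "aws_subnet" and .name == "private")
               | .instances[].attributes.id'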

Hi @PeterBocan,

When another program needs information generated by applying a Terraform configuration, the most typical answer is to declare an Output Value in your root module that returns the necessary data in a form that’s hopefully convenient to use.

By default Terraform will show the output values from the root module as part of the output of terraform apply. If you want to use those values programmatically, you can run the terraform output command with either the -json or -raw option to retrieve the value in a machine-readable format.

I’m not sure from your question exactly what information you need, but if it’s data that is exported as attributes of the resources you have declared then you should be able to write an output value expression to take that data and transform it into a suitable form for your downstream use.
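
For example, if what you need is the private subnet IDs, an output like this in the root module would do it (the resource address here is just a guess based on your grep pattern):

    output "private_subnet_ids" {
      description = "IDs of the private subnets, for downstream consumers."
      value       = aws_subnet.private[*].id
    }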

Hi @apparentlymart, thanks for the response! The terraform output command does not seem to work with the remote state (I did not use the terraform_remote_state data source, as its use is a bit unclear to me).

I do have output values; however, they are within the submodule which sets up the VPC, etc., not in the module which instantiates that submodule.

To clarify, I have the following structure:

modules/
  - web/
    - alb.tf
    - vpc.tf
    - variables.tf
    - outputs.tf
    - ecs.tf
envs/
  - dev/
  - demo/
    - main.tf (refers to the versioned "web" module above)
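
The demo main.tf essentially just instantiates the versioned module, along these lines (the source address is made up):

    module "web" {
      source = "git::https://example.com/infra-modules.git//web?ref=v1.2.0"

      # ...environment-specific input variables...
    }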

Everything is stored in a Git repo. States are stored in S3, with a DynamoDB table for locking.

Hi @PeterBocan,

I assume from what you’ve stated here that you have written your root module with a backend "s3" block that tells Terraform to store state in an S3 bucket.
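
That is, something along these lines (the bucket, key, and table names here are just placeholders):

    terraform {
      backend "s3" {
        bucket         = "example-terraform-states"
        key            = "envs/demo/terraform.tfstate"
        region         = "eu-west-1"
        dynamodb_table = "example-terraform-locks"
      }
    }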

If that’s true then terraform output should honor that setting. That command essentially performs the following steps:

  • Fetch the latest state snapshot from the configured storage location, as chosen by your backend block.
  • Retrieve either the full set of root module output values or the single selected output value from the “outputs” section of the state snapshot.
  • Marshal the value into a suitable form to write to the command’s stdout, depending on which options you specified:
    • No options at all means the human-oriented presentation, similar to what appears at the end of terraform apply.
    • -json produces either a JSON object describing all of the values or a single JSON value representing the single selected value.
    • -raw requires that you’ve selected a single value and requires it to be something that can convert to type string. After converting to string it then just writes that string directly to stdout with no other formatting.
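
Concretely, assuming an output value named private_subnet_ids and a string output named vpc_id (both names are just placeholders):

    # Human-oriented listing, as at the end of terraform apply
    terraform output

    # All root module outputs as a single JSON object
    terraform output -json

    # One output as JSON, e.g. fed into jq
    terraform output -json private_subnet_ids | jq -r '.[]'

    # One string-typed output, written directly to stdout
    terraform output -raw vpc_id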

With all of that said, if this isn’t working for you then I’d like to learn more about exactly what commands you are running and exactly what output they are producing. Then I might be able to explain why this isn’t working as I described above.

No :frowning: I did not create a root module. I thought that would be a bit problematic if it all was in one big state. So I created S3 states for the different envs, plus a state which contains the corp (shared) stuff.

Hi @PeterBocan,

From what you’ve described it sounds like each of your environments has its own root module and your “corp” category also has its own root module.

Each of those can have its own output values, but I think perhaps you are asking about how best to share data between them? For example, perhaps each of your environment configurations needs to make use of objects declared in the “corp” configuration. Is that what you are asking about?

Yes, I thought having separate root modules was the way to go. I am not sure that having one big Terraform state which contains the entire infrastructure is a wise thing to do.

Thanks for helping me to understand your goals.

When decomposing a system into multiple separate Terraform configurations, the typical strategy for describing dependencies between your components is to use data resources, declared using data blocks. As far as Terraform Core is concerned all data sources are equal, but in practice there are three main techniques to choose from, each with different tradeoffs of explicit vs. implicit and tight vs. loose coupling:

               loose coupling          tight coupling
  explicit     indirect data sources   remote state outputs
  implicit     direct data sources     (none)

  • Indirect data sources: in the “producer” configuration, use a resource block to manage an explicit configuration-sharing object, like an AWS SSM parameter, and then in the consumer configuration use a data block to retrieve that same object (see the first sketch after this list).

    This is loose coupling because the two configurations only need to agree on a common location for the data to be placed. Future refactoring might change which subsystem is the “producer”, but the consumers don’t need to change as long as the new producer writes equivalent data to the same location.

  • Direct data sources: the producer configuration just declares some objects using resource blocks, typically using some informal strategy, such as tagging, to distinguish those objects from other objects of the same type. The consumer uses data blocks to retrieve those objects directly, often by relying on the same tagging scheme.

    This is “implicit” because the consumer is relying on some details of how the producer is implemented. But it’s loosely coupled because refactoring could change which subsystem is the producer in future, as long as the relevant implementation details don’t change.

  • Remote state outputs: use the terraform_remote_state or tfe_outputs data sources to directly retrieve the output values exported by the producer configuration (see the second sketch below).

    This is explicit because the producer decides exactly what to export and how to export it, and then the consumer makes use of that API. But it’s tightly coupled because the consumers must all know exactly which subsystem is the producer; if you refactor in future you might need to update all of the consumers to get their data from the new producer.
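
Here’s a minimal sketch of the indirect approach using an SSM parameter; all of the names are placeholders:

    # In the "producer" configuration:
    resource "aws_ssm_parameter" "private_subnet_ids" {
      name  = "/corp/network/private_subnet_ids"
      type  = "String"
      value = jsonencode(aws_subnet.private[*].id)
    }

    # In the "consumer" configuration:
    data "aws_ssm_parameter" "private_subnet_ids" {
      name = "/corp/network/private_subnet_ids"
    }

    locals {
      private_subnet_ids = jsondecode(data.aws_ssm_parameter.private_subnet_ids.value)
    }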

All of these are valid approaches for different situations. I’ve written them above in my preference order in case the differences don’t seem significant for your situation: I prefer explicit and loose coupling because it gives the most freedom for future refactoring and helps future maintainers to understand the system by reading the producer and consumer configurations.
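
And for completeness, a sketch of the remote state approach against your S3 backend; the bucket and key are placeholders, and the referenced output must be declared in the producer’s root module:

    data "terraform_remote_state" "corp" {
      backend = "s3"

      config = {
        bucket = "example-terraform-states"
        key    = "corp/terraform.tfstate"
        region = "eu-west-1"
      }
    }

    # Then refer to e.g.:
    #   data.terraform_remote_state.corp.outputs.private_subnet_ids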

I hope that helps! If this is too general, please share more specifics about your goals.

Hey thanks for the breakdown, very interesting read!

Well, I have two big parts to configure:

a) corporate infrastructure - completely separate from the product itself
b) product platform, which is modularised and might be used in multiple environments. I stuck with versioning the modules. It’s a bit clunky, but it works: I can nicely separate environments by pinning them to different versions of the modules.

What you are saying makes sense; I think I will need to utilise data blocks more often. I am new to Terraform, so I am a little bit cautious about what I use and how I use it.