Using terraform_remote_state over common data filters

Hi @fishi0x01!

The approach we’ve generally recommended as the ideal case is actually a third option: explicitly write the results into a configuration store or data store and then have other configurations read them from there.

Doing that requires having a suitable configuration store deployed, though: some place to store data that Terraform can both write to and read from. I’m not familiar with OpenStack so I’m not sure whether it has a suitable store, but some examples of such stores in other systems are AWS SSM Parameter Store and HashiCorp Consul. I’m going to use AWS SSM Parameter Store here just for the sake of example.

In the “producer” configuration, we can use the aws_ssm_parameter resource type to explicitly publish a specific value to be consumed elsewhere:

resource "aws_ssm_parameter" "example" {
  name  = "vpc_id"
  type  = "String"
  value = aws_vpc.example.id
}

Then in other “consumer” configurations, we can use the corresponding data source to retrieve that value:

data "aws_ssm_parameter" "example" {
  name = "vpc_id"
}

# and then use data.aws_ssm_parameter.example.value
# somewhere else.
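
For example, a consumer could feed that parameter straight into one of its own resources. The following is just an illustrative sketch: the aws_subnet resource and its CIDR block are assumptions for the sake of the example, not part of the pattern itself.

# Hypothetical consumer resource that uses the published VPC ID
resource "aws_subnet" "example" {
  vpc_id     = data.aws_ssm_parameter.example.value
  cidr_block = "10.0.1.0/24" # assumed value, for illustration only
}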

This approach has two nice characteristics:

  • The publishing of the information is explicit, so the intent is clear that this value is meant to be consumed elsewhere rather than being just an implementation detail.
  • The “consumer” configurations using this value are decoupled from the “producer” configuration, because in principle that value could’ve been written by any Terraform configuration, or possibly not by Terraform at all. If you change your system architecture in future, you can potentially change how that value gets populated without requiring changes to the consumers, because the configuration/data store serves as an indirection.

The two options you listed are alternatives to this ideal for situations where you don’t have access to a configuration store. Each of them meets one of the nice characteristics above, but neither can meet both:

  • Option 1 (directly retrieving objects from the target system) achieves decoupling, but it isn’t explicit about which objects are intended for external consumption and which are not. In a system that supports publishing and querying by custom tags you can approximate explicit publishing with a standard tagging scheme (sketched after this list), but you then need to ensure that everything in your system follows that scheme properly.
  • Option 2 (terraform_remote_state, also sketched below) achieves explicit publishing, but suffers from tight coupling: if you later want that value to be managed by some other Terraform configuration, or by another system entirely, then you’ll need to change the consumers. Conversely, it’ll be hard (though not impossible) to consume that value from any system other than Terraform itself, whereas generic configuration stores can more easily serve non-Terraform clients too.
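
To make that tradeoff more concrete, here is a rough sketch of option 1 using a tagging scheme on AWS; the tag name and value are purely hypothetical conventions, not anything standard:

# Producer: mark the VPC as published by following an agreed tagging scheme
resource "aws_vpc" "example" {
  cidr_block = "10.0.0.0/16" # assumed CIDR, for illustration only

  tags = {
    PublishedAs = "shared-vpc" # hypothetical convention meaning "intended for external consumption"
  }
}

# Consumer: look the VPC up by that tag, with no reference to the producer's state
data "aws_vpc" "shared" {
  tags = {
    PublishedAs = "shared-vpc"
  }
}

And here is a similarly rough sketch of option 2, assuming the producer happens to keep its state in an S3 backend; the bucket, key, and region are placeholders:

# In the producer configuration: publish the value as a root module output
output "vpc_id" {
  value = aws_vpc.example.id
}

# In the consumer configuration: read the producer's state directly
data "terraform_remote_state" "network" {
  backend = "s3"

  config = {
    bucket = "example-terraform-state" # placeholder bucket name
    key    = "network/terraform.tfstate"
    region = "us-east-1"
  }
}

# ...and then use data.terraform_remote_state.network.outputs.vpc_id

Notice that in the second sketch the consumer has to know exactly where the producer keeps its state, which is the coupling described above.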

If this “ideal” approach isn’t viable in your environment then there is no single answer to which of the other options will be “best” in all cases; instead, you’ll need to make a tradeoff based on which of these two characteristics is more important in your situation. I hope the above at least helps you consider the implications of each and come to a final decision!
