Is there any good mechanism to source the values for a root module’s input variables from a database? Or are there any caveats one should take into account when trying such an approach?
We’re running a data mesh platform where we use Terraform to deploy standardized data product infrastructure. The components of a data product’s infrastructure are defined via modules that we call from a root module that encompasses all data mesh infrastructure.
Currently, the data products are declared via a map-type variable that we feed into the root module from a .tfvars file. We’d like to move away from this approach and instead source this structured input from a database, but there doesn’t seem to be a native way to do this (for what it’s worth, I suspect Consul-Terraform-Sync does a similar thing behind the scenes).
The root module input variables always come from outside of the root module, and therefore outside of your Terraform configuration altogether.
That means that if you want to set those variable values programmatically then you will need to do so from outside of Terraform, such as by generating a .tfvars or .tfvars.json file based on what you find in your database before you run Terraform.
That is indeed the strategy used by Consul-Terraform-Sync: it generates a .tfvars file containing the information retrieved from the Consul catalog, and then passes that to Terraform when it plans and applies the selected module.
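As a rough illustration of that pregeneration step, here is a minimal Python sketch. Everything specific in it is an assumption made for the example: the data_products table and its columns, the in-memory SQLite database standing in for your real one, and "products" as the name of the map-type variable. It builds the variable values from query results and writes them out as JSON.

```python
import json
import sqlite3
from pathlib import Path


def build_tfvars(conn: sqlite3.Connection) -> dict:
    """Build the root module's variable values from database rows."""
    rows = conn.execute(
        "SELECT name, owner, storage_gb FROM data_products ORDER BY name"
    ).fetchall()
    # "products" is an assumed name for the map-type input variable.
    return {
        "products": {
            name: {"owner": owner, "storage_gb": storage_gb}
            for name, owner, storage_gb in rows
        }
    }


def write_tfvars(tfvars: dict, path: Path) -> None:
    """Write the values as JSON, with sorted keys so reruns produce stable files."""
    path.write_text(json.dumps(tfvars, indent=2, sort_keys=True) + "\n")


def demo_connection() -> sqlite3.Connection:
    """In-memory stand-in for the real database, for illustration only."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE data_products "
        "(name TEXT PRIMARY KEY, owner TEXT, storage_gb INTEGER)"
    )
    conn.executemany(
        "INSERT INTO data_products VALUES (?, ?, ?)",
        [("sales", "team-a", 100), ("billing", "team-b", 50)],
    )
    return conn
```

Calling write_tfvars(build_tfvars(conn), Path("terraform.tfvars.json")) in the root module directory before running Terraform means the file is loaded automatically on the next plan or apply; any other file name would need to be passed with -var-file.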
If you’d rather do this dynamically at runtime, without pregenerating any files, you’ll need to introduce an extra module layer: a new root module that contains the rules Terraform should follow to retrieve the data and prepare it to pass into what was previously your root module.
Since your database is presumably using a custom schema, you’ll need to write some custom code to query it. There are two main options for that:
1. You could write your own Terraform provider that is designed to work directly with your database, or with an API layer you’ve wrapped around your database. This provider would offer a data source that returns all of the data you need in whatever structure is most convenient.
This is the most flexible solution, but the official Terraform development libraries are written in Go, so it’ll be difficult to do this in any language other than Go.
2. You could use the hashicorp/external provider’s external data source to have Terraform execute an external program to retrieve the data.
That data source only knows how to return a map of strings, so you’ll need to decide some way to package the structured data fetched from the database into one or more strings. One straightforward answer would be to have your external program return a JSON object with a single property whose value is a string containing a nested JSON serialization of the data, and then use Terraform’s jsondecode function to decode it into a live data structure again.
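To make the second option more concrete, here is a minimal Python sketch of such an external program. The external data source sends its query argument to the program as a JSON object on stdin and expects a JSON object with string values on stdout. The fetch_products function, the "environment" query key, and the sample records are hypothetical placeholders for your real database lookup.

```python
import json
import sys


def fetch_products(query: dict) -> dict:
    """Hypothetical placeholder for the real database query."""
    env = query.get("environment", "dev")
    return {
        "sales": {"owner": "team-a", "environment": env},
        "billing": {"owner": "team-b", "environment": env},
    }


def build_result(query: dict) -> dict:
    """Pack the structured data into a single string-valued property."""
    # The data source only accepts string values, so the nested structure
    # is serialized to JSON a second time.
    return {"products": json.dumps(fetch_products(query))}


def main(stdin, stdout, stderr) -> int:
    """Read the query from stdin and write the result object to stdout."""
    try:
        # Build the full result first so nothing partial reaches stdout.
        result = build_result(json.load(stdin))
    except Exception as err:
        # An error message on stderr plus a non-zero exit status signals failure.
        print(f"failed to fetch products: {err}", file=stderr)
        return 1
    json.dump(result, stdout)
    return 0


# When saved as a script, the entry point would be:
#     sys.exit(main(sys.stdin, sys.stdout, sys.stderr))
```

On the Terraform side, the data source’s program argument would point at this script, and an expression along the lines of jsondecode(data.external.example.result.products), where "example" is whatever name you give the data block, would turn the string back into a map.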
Your new root module would then consist of a call to the data source that is fetching the data (either of the two options above) and then a module block calling what was previously your root module, but is now a child module:
data "custom_example_thing" "example" {
  # ...
}

module "infrastructure" {
  source = "./infrastructure"

  # This can be a direct assignment if you write your
  # own provider and make sure it returns a type
  # that matches what your module expects.
  # If you use the "external" data source then you
  # will probably need to do some wrangling here
  # to transform the encoded value into what the
  # module expects.
  products = data.custom_example_thing.example.products
}
I was indeed considering the first option you listed because it feels more Terraform-native, but was hesitant due to how unpredictable it can sometimes be to iterate a resource or module over a data source.
It does seem that externally generating a terraform.tfvars or terraform.tfvars.json file offers a more stable and decoupled way to extend Terraform towards a service catalog of sorts, so that is very likely the approach we’ll pursue.