Reduce API calls for Data Sources to improve apply performance

Hi,

I have a best-practice question about data sources. We rely heavily on data sources like the ones below in our TF modules instead of passing the values in through variables.

e.g.,

data "aws_partition" "current" {}
data "aws_caller_identity" "current" {}
data "aws_region" "current" {}
data "aws_iam_session_context" "current" {}

Sometimes I have to use the output values of these data sources in multiple places in the same submodule.

My concern is whether Terraform calls the AWS API behind the data source for every occurrence I use in the code. Is there a caching mechanism?

e.g., getting the current partition with data.aws_partition.current.partition, which is used in multiple places in the module

Will it reduce the number of calls if I store the data source output in a local value and use that local value across the module?

e.g.,

locals {
  aws_current_partition = data.aws_partition.current.partition
}

and then use this everywhere in the code

local.aws_current_partition

Please advise the best way to reduce the number of calls to improve the performance of PLAN and APPLY.

More generally, does HashiCorp have any "don'ts" document for Terraform that you can share? We would like to improve the performance of our code by avoiding unnecessary calls.

Thanks in Advance!

Hi @vara-bonthu,

Each data block is independent of the others and so will typically make its own API call.

The main way to avoid redundantly re-fetching the same information is to make only one module (which could be the root module itself) responsible for loading the data and then pass it between modules using input variables and output values.

I see in your question that you are fetching the data multiple times specifically to avoid passing it between modules, but the tradeoff of doing that is that each module must therefore do its own call, creating the situation you’ve observed here of fetching the same data multiple times.
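
As a minimal sketch of that approach, the root module below reads the data sources once and hands the values to a child module through input variables. The module path ./modules/example and the variable names aws_partition and aws_account_id are hypothetical, chosen only for illustration, and the two files are shown in one block for brevity.

```hcl
# Root module (main.tf): each data source is declared once, here.
data "aws_partition" "current" {}
data "aws_caller_identity" "current" {}

# Hypothetical child module; the path and argument names are illustrative.
module "example" {
  source = "./modules/example"

  aws_partition  = data.aws_partition.current.partition
  aws_account_id = data.aws_caller_identity.current.account_id
}

# Child module (modules/example/variables.tf): accepts the values as
# inputs instead of declaring its own data blocks.
variable "aws_partition" {
  type        = string
  description = "AWS partition, passed in from the root module."
}

variable "aws_account_id" {
  type        = string
  description = "AWS account ID, passed in from the root module."
}

# Child module (modules/example/main.tf): uses the inputs wherever the
# data source attributes were referenced before.
locals {
  iam_arn_prefix = "arn:${var.aws_partition}:iam::${var.aws_account_id}"
}
```

The tradeoff is the one described above: the child module no longer looks the values up itself, so every caller has to supply them.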

Thanks @apparentlymart for the clarification. I think we will define these data sources in the root module and pass their values as variables to the submodules :+1: