Module dependency causes datasource to be read during apply

deasunk · September 26, 2022, 3:21pm

The module dependency below causes a data source in the king module to be read during apply.
However, the datasource does not use any of the outputs from elvis.

If I remove the depends_on the datasource is read during terraform plan.

module "elvis" {
  source = "./modules/elvis"
}

module "king" {
  source = "./modules/king"

  depends_on = [module.elvis] # Causes data sources in the king module to be "read during apply"
}

stuart-c · September 26, 2022, 6:52pm

The depends_on is saying that everything within the king module depends on the elvis module.

In general depends_on shouldn’t be used. What is the reason for that being set? Terraform should be able to determine dependencies by seeing how outputs and inputs to resources are linked.

apparentlymart · September 26, 2022, 11:38pm

Indeed… depends_on is a very imprecise way of specifying dependencies, and it will typically generate a far more conservative plan than would be possible with normal reference-based dependencies.

Terraform behaves in this way because depends_on exists to describe so-called “hidden dependencies”, where a remote system has some ordering relationship between two objects that isn’t evident from the references between them. One example is an AWS Lambda function: its configuration refers to the IAM role that the function acts as, but not to any policies attached to that role that actually allow it to make requests. The Lambda function therefore has a hidden dependency on the IAM role policy, which must be described using depends_on.
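To make the Lambda case concrete, here is a rough sketch of what that hidden dependency might look like at the resource level. The resource names, file name, runtime, and policy contents are all placeholders I made up for illustration, not anything from the original question:

```hcl
resource "aws_iam_role" "example" {
  name = "example-lambda-role" # placeholder name

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Action    = "sts:AssumeRole"
      Effect    = "Allow"
      Principal = { Service = "lambda.amazonaws.com" }
    }]
  })
}

resource "aws_iam_role_policy" "example" {
  name = "example-lambda-policy" # placeholder name
  role = aws_iam_role.example.id

  # Placeholder permissions; a real function would need its own policy.
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Action   = ["logs:CreateLogStream", "logs:PutLogEvents"]
      Effect   = "Allow"
      Resource = "*"
    }]
  })
}

resource "aws_lambda_function" "example" {
  function_name = "example"      # placeholder name
  filename      = "lambda.zip"   # assumed local deployment package
  handler       = "index.handler"
  runtime       = "nodejs16.x"

  # The reference below only creates a dependency on the role itself...
  role = aws_iam_role.example.arn

  # ...so the dependency on the policy has to be stated explicitly.
  depends_on = [aws_iam_role_policy.example]
}
```

Note that depends_on here names one specific resource, so only the Lambda function is ordered after the policy.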

However, even then I would recommend specifying the dependency more precisely than over an entire module. For example, if the lambda function and its IAM role were encapsulated into a module then the module might return some information about the Lambda function with an associated explicit dependency on the role policy to ensure that nothing downstream tries to make use of the Lambda function until it’s fully ready:

output "lambda_function_arn" {
  value      = aws_lambda_function.example.arn
  depends_on = [aws_iam_role_policy.example]
}

Using depends_on in a module block is the worst case for conservative behavior because, as @stuart-c said, you’re telling Terraform that everything in module "king" must be dealt with only after everything in module "elvis" has been applied, which includes any data blocks. data blocks must be included because they may be trying to read a value that would be created or updated by something in module "elvis" due to a hidden dependency, and you haven’t given Terraform enough information to know what exactly depends on what so it creates dependency relationships for all of the objects in the two modules.
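For contrast, a reference-based version of the original example might look something like the following. This is only a sketch: the output name example_id and the input variable elvis_example_id are hypothetical, since the thread doesn't say what the modules actually contain.

```hcl
module "elvis" {
  source = "./modules/elvis"
}

module "king" {
  source = "./modules/king"

  # Hypothetical names: assumes modules/elvis declares an output "example_id"
  # and modules/king declares a variable "elvis_example_id".
  elvis_example_id = module.elvis.example_id
}
```

With this shape, only the objects inside module "king" that actually refer to var.elvis_example_id depend on module "elvis". A data block in "king" that doesn't use that variable has no such dependency and can be read during plan.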

It’s very rare that it should be necessary to set depends_on in a module block, and it’s almost always better not to do it. If you can say more about why these modules depend on one another – ideally using some real resource type names rather than these placeholder module names – we can hopefully show a more precise way to describe the dependency relationships so that you won’t need to set depends_on for the entire module.

deasunk · September 27, 2022, 10:54am

@apparentlymart I believe my requirement is the same.

For example, in the configuration below the aks module sets secrets in Azure Key Vault. It only needs the key vault ID in order to add the secret.
But it also depends on the azurerm_key_vault_access_policy being set (a hidden dependency).

Is this the correct approach?
Instead of depends_on, should I add input variables for key_vault_id and key_access_policy_id,
even though the aks module doesn’t use key_access_policy_id? Should the aks module then output key_access_policy_id to handle the hidden dependency?

module "keyvault" {
  source = "./modules/keyvault"

  # resources: azurerm_key_vault, azurerm_key_vault_access_policy
  # outputs: key vault id & access policy id
}
 
module "aks" {
  source = "./modules/aks"
  # resources: azurerm_key_vault_secret

  key_vault_id         = module.keyvault.id
  key_access_policy_id = module.keyvault.access_policy_id

  # output var.key_access_policy_id

}

apparentlymart · September 27, 2022, 3:19pm

From what you described it seems like the abstraction being offered by your key vault module is a key vault that is configured with the necessary access for users of that vault to be able to immediately use it.

If so, I think the best encapsulation would be for the key vault module to return the key vault ID with an explicit dependency on the policy, which means that the key vault ID won’t be available to callers of the module until the policy has also been configured.

For example:

output "id" {
  value      = azurerm_key_vault.example.id
  depends_on = [azurerm_key_vault_access_policy.example]
}

This approach should achieve the correct behavior without the calling module needing to be aware of the access policy object. As far as the caller is concerned there is only a key vault ID which is guaranteed to already have the required policy without the caller needing to do anything special beyond just passing that key vault ID wherever it is needed.
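On the calling side, that might end up looking roughly like this. It's a sketch under some assumptions: the variable names, the secret name, and the secret_value input are hypothetical, and the resource labels are placeholders.

```hcl
# Root module
module "keyvault" {
  source = "./modules/keyvault"
}

module "aks" {
  source = "./modules/aks"

  # No depends_on and no access policy ID needed here: the "id" output
  # already carries the dependency on the access policy.
  key_vault_id = module.keyvault.id
  secret_value = var.secret_value # hypothetical root-level variable
}

variable "secret_value" {
  type      = string
  sensitive = true
}
```

```hcl
# modules/aks (assumed layout)
variable "key_vault_id" {
  type = string
}

variable "secret_value" {
  type      = string
  sensitive = true
}

resource "azurerm_key_vault_secret" "example" {
  name         = "example-secret" # placeholder name
  value        = var.secret_value
  key_vault_id = var.key_vault_id
}
```

Because azurerm_key_vault_secret.example refers to var.key_vault_id, it is ordered after the access policy, while anything else in the aks module that doesn't use that variable is unaffected.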
