Inter-module dependency on null_resource forces TF to replace all resources without any changes in the code

vmnomad · February 8, 2022, 10:29am

Hi there,

I am having a weird issue with TF 0.13.7.

We have 4 local modules:

Module_A
Module_B
Module_C
Module_X

Module_X, amongst many other resources, has null_resource.register_providers with a local-exec provisioner that runs a bash script. The script registers multiple Azure Resource Providers.

resource "null_resource" "register_providers" {
  triggers = {
    list_of_providers = local.list_of_providers
  }

  provisioner "local-exec" {
    environment = {
      CSV_OF_PROVIDERS = local.list_of_providers
      API_VERSION      = local.api_version
    }

    interpreter = ["/bin/bash", "-c"]
    command     = file("${path.module}/files/register_provider.sh")
  }
}

No idea why azurerm_resource_provider_registration wasn’t used. I joined this team only a week ago.

Module_X also has an output for null_resource.register_providers

output "register_providers" {
  value = null_resource.register_providers
}

In the main TF folder we call all 4 modules, but null_resource.register_providers in Module_X has to complete before the other three modules run. Hence, the code in the main TF folder that calls Modules A, B and C has:

module "module_a" {
  source = "../../../modules/module_a"

...
...
...

  depends_on = [
    module.module_x.register_providers,
  ]
}

This config works just fine for the first run; however, on consecutive runs TF wants to replace all resources in Modules A, B and C even though there are no changes to the code.

After playing with it a bit I figured out that it all comes down to null_resource. As a test I replaced it with some other resources in the output of Module_X, and TF stopped trying to replace all objects in Modules A, B and C.

The dependency between modules is actually required for the first run only. The current workaround is to remove this dependency from the code manually after all resources are provisioned, but I am after a proper solution.

Any ideas how to fix it?

jbardin · February 8, 2022, 1:55pm

Hi @vmnomad,

Without a complete example I can’t explain exactly how the change is being triggered, but the common cause is the use of depends_on with a data source. If you specify that a data source depends_on a managed resource, that data source cannot be read until any pending changes in the managed resource have been resolved. Adding depends_on to a module means that everything within that module depends on the referenced value.

The solution is to remove the blanket depends_on statement, and assign the dependencies only where they are required. If there is no explicit assignment possible, this may mean setting depends_on only on the specific resource which needs it within the module.
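As a rough sketch of what narrowing the dependency could look like, assuming Module_A contains one resource that genuinely needs the providers registered first. The output name register_providers_id, the variable registration_done, and the azurerm_resource_group.example resource are hypothetical placeholders, not names from the original configuration:

```hcl
# modules/module_x/outputs.tf
# Expose only the ID rather than the whole null_resource object.
output "register_providers_id" {
  value = null_resource.register_providers.id
}

# modules/module_a/variables.tf
# Hypothetical input used purely to carry the ordering dependency.
variable "registration_done" {
  type    = any
  default = null
}

# modules/module_a/main.tf
# Hypothetical resource: only this one waits for the registration.
resource "azurerm_resource_group" "example" {
  name     = "example-rg"
  location = "westeurope"

  depends_on = [var.registration_done]
}

# Root module: no module-level depends_on any more.
module "module_a" {
  source = "../../../modules/module_a"

  # other arguments unchanged
  registration_done = module.module_x.register_providers_id
}
```

With this shape, any data sources inside Module_A that do not carry the depends_on can still be read during the plan, so they should no longer show up as "known after apply" on every run.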

Another guess here is from the clue that the depends_on is only required for the first apply. This usually hints at a managed resource and data source representing the same logical resource within a single configuration. If that’s the case the solution is to remove the data source and pass the managed resource value directly to the dependencies which need it.

apparentlymart · February 9, 2022, 12:58am

I think an extra clue here is the fact that this null_resource resource’s provisioner seems to be creating a remote object in a way that would more typically be done with a resource block. As @vmnomad noticed, it’s a little strange to be creating an object like that when there is a resource type available for that same object type in the provider; my guess would be that the resource type was added to the provider at some later point and this provisioner approach was a workaround to avoid waiting for a new provider release.

Given that, I wonder if there’s a data block inside that module which tries to look up the object that the provisioner created, in order to use it as if it was a normal Terraform resource. Unfortunately, that can then fall victim to the problem @jbardin described where the configuration is telling Terraform to defer reading the data resource until the apply step, which in turn causes downstream resources to need to be replaced.

My suggestion in that case would be to try to replace the provisioner here with a real azurerm_resource_provider_registration resource and then return whatever attributes of that resource are required by the other module, so that you can pass them across to achieve a module composition design:

resource "azurerm_resource_provider_registration" "example" {
  # ...
}

output "provider_name" {
  value = azurerm_resource_provider_registration.example.name
}

module "module_x" {
  source = "../../../modules/module_x"

  # ...
}

module "module_a" {
  source = "../../../modules/module_a"

  # ...
  provider_name = module.module_x.provider_name
}

The goal here then would be that this module_a just takes the provider name determined by the other module and uses it directly, rather than attempting to look it up again using a data resource. That should then allow the configuration to converge in a stable state, once that registration is created and its name attribute (used by other resources in the other module, presumably) remains stable.
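Since the original script registers several providers at once, one possible way to express that with the real resource type is for_each. This is only a sketch: it assumes local.list_of_providers is a comma-separated string (as the CSV_OF_PROVIDERS variable name suggests), and the resource label "this" and the output name are made up for illustration:

```hcl
# modules/module_x/main.tf
# Assumption: local.list_of_providers is a comma-separated string of
# resource provider namespaces.
resource "azurerm_resource_provider_registration" "this" {
  for_each = toset(split(",", local.list_of_providers))

  name = each.value
}

# modules/module_x/outputs.tf
# Hypothetical output: the names of the registered providers, sorted so
# the value stays stable between runs.
output "registered_provider_names" {
  value = sort([for r in azurerm_resource_provider_registration.this : r.name])
}
```

It is worth checking the azurerm provider documentation before adopting this, since the provider can register a default set of resource providers on its own and the two mechanisms may overlap.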
