Terraform plan wants to replace all components when a map of resources passed to a module has changed

When the output of a module changes and then is used as a variable in another module, all object_ids of pre-existing resources in the output become unknown. This causes resources made by the second module to be force replaced (destroyed and created again with exactly the same properties).

I opened an issue on GitHub about this (Terraform plan want to replace all components when a map of resources passed to a module has changed · Issue #31984 · hashicorp/terraform · GitHub), but I was told to ask here. A code example of the problem can be found in the GitHub issue.

The problem occurs when creating VMs with accompanying auto shutdown schedules using the azurerm provider. When new VM resources are created (and thus the output of our VM-module changes), all shutdown schedules, including ones that are linked to pre-existing VMs, are replaced. This happens because the module for auto shutdown schedules uses the output of the VM-module, and TF thinks this whole output map is unknown. I think it should only mark the new part of the map as unknown.

Please ask away if you need more information. As stated in the GitHub issue, we have been troubleshooting this very extensively. Removing depends_on relations between the modules has not been the solution for us. This helps if you use data sources, but apparently not if you use the output of a module as input for another module.

Thanks in advance!

Hi @bongersjb,

I can’t see everything in the config from the snippets in the linked issue, but maybe there’s enough to walk back from the unexpected change to find the problem.

We know from the plan output that the change to

azurerm_dev_test_global_vm_shutdown_schedule.template.virtual_machine_id 

is forcing replacement, because the planned value is unknown.

Looking at the configuration, that attribute is assigned the expression

var.linux_virtual_machines_data[each.value.linux_virtual_machine_key].id

It appears that the values for var.linux_virtual_machines_data are statically defined, so we can assume those are fully known, and the problem is with each.value.linux_virtual_machine_key.

Inspecting the for_each expression shows at least one problem:

for_each = try(var.dev_test_global_vm_shutdown_schedules_data, {})

There is no reason for the try() function here, since the given expression cannot fail. All it could do is cause more values to become unknown. I would see if removing try() fixes the problem.
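As a minimal sketch of what I mean, assuming the variable is declared with an empty-map default (I can't see the real declaration, so the variable blocks, the `any` types, and the `location`, `daily_recurrence_time` and `timezone` attributes below are guesses for illustration):

```hcl
# Assumed declaration: the empty-map default already covers the "not set"
# case, so a reference to this variable cannot fail to evaluate.
variable "dev_test_global_vm_shutdown_schedules_data" {
  type    = any
  default = {}
}

# Assumed declaration of the VM data passed in from the other module.
variable "linux_virtual_machines_data" {
  type    = any
  default = {}
}

resource "azurerm_dev_test_global_vm_shutdown_schedule" "template" {
  # No try() wrapper: the variable reference is used directly.
  for_each = var.dev_test_global_vm_shutdown_schedules_data

  virtual_machine_id    = var.linux_virtual_machines_data[each.value.linux_virtual_machine_key].id
  location              = each.value.location              # hypothetical attribute
  daily_recurrence_time = each.value.daily_recurrence_time # hypothetical attribute
  timezone              = each.value.timezone              # hypothetical attribute

  notification_settings {
    enabled = false
  }
}
```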

Hi @jbardin,

Thanks for continuing to help me with my issue, even here on the community forum.

We use the try() function to also check for a windows_virtual_machine_key, because we use this module together with our windows_virtual_machines module as well (I didn’t share that module because our test was done with linux_virtual_machines).
That’s why the try() function is still in the example I shared: we use it to get the virtual_machine_id based on either a windows_virtual_machine_key or a linux_virtual_machine_key.
In our actual codebase, that line looks like this:

virtual_machine_id    = try(var.windows_virtual_machines_data[each.value.windows_virtual_machine_key].id, var.linux_virtual_machines_data[each.value.linux_virtual_machine_key].id)

As it turns out, removing the try() function actually solves our problem. The thing is though, we want to use the dev_test_global_vm_shutdown_schedule module for both Windows and Linux VMs.

Is there an alternative way to accomplish this without the try() function? It was my understanding that the try() function decided on what the virtual_machine_id will be and the value would be known before the apply. Turns out that that’s not true. All values needed are there before the apply, right? So why does Terraform mark the value as “unknown”? And further, why does the try() make any difference here?
Both values between the round brackets of the function are known:
var.windows_virtual_machines_data[each.value.windows_virtual_machine_key].id = null

var.linux_virtual_machines_data[each.value.linux_virtual_machine_key].id = the value we actually need here.
So try() takes the second value and the value is known. Right?

Please correct me if I’m wrong!

Thanks again!
Jan

The try() and can() functions are a bit special, because they need to attempt to determine if the given expression will fail to evaluate before we fully evaluate it. If part of the expression is unknown, it is attempting to be conservative about whether that expression might later fail once the values are fully known, because it must return a consistent value in both cases.

This happens to be a known issue which needs more investigation, and you can see some more explanation here.

I agree that try might be more convenient in some cases, especially when the object type is not known. With the limited example, I’m not sure I understand the use case enough to offer an alternative, but I have a feeling there is a way to accomplish the same thing by declaring concrete types so that attributes are always valid, and conditionals to check for empty containers or null values.
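A rough sketch of that idea, not a drop-in replacement: it assumes Terraform 1.3 or later (for optional() object attributes) and that each schedule entry sets exactly one of the two keys. The object shapes and the `location`, `daily_recurrence_time` and `timezone` attributes are my assumptions, since I can't see the full module.

```hcl
# Concrete types: every VM entry is known to have an "id" attribute.
variable "windows_virtual_machines_data" {
  type    = map(object({ id = string }))
  default = {}
}

variable "linux_virtual_machines_data" {
  type    = map(object({ id = string }))
  default = {}
}

# optional() makes a missing key null instead of an invalid attribute
# reference, so it can be checked with a plain conditional.
variable "dev_test_global_vm_shutdown_schedules_data" {
  type = map(object({
    windows_virtual_machine_key = optional(string)
    linux_virtual_machine_key   = optional(string)
    location                    = string # hypothetical attribute
    daily_recurrence_time       = string # hypothetical attribute
    timezone                    = string # hypothetical attribute
  }))
  default = {}
}

resource "azurerm_dev_test_global_vm_shutdown_schedule" "template" {
  for_each = var.dev_test_global_vm_shutdown_schedules_data

  # Choose the VM map based on which key is set, without try() or can().
  virtual_machine_id = (
    each.value.windows_virtual_machine_key != null
    ? var.windows_virtual_machines_data[each.value.windows_virtual_machine_key].id
    : var.linux_virtual_machines_data[each.value.linux_virtual_machine_key].id
  )

  location              = each.value.location
  daily_recurrence_time = each.value.daily_recurrence_time
  timezone              = each.value.timezone

  notification_settings {
    enabled = false
  }
}
```

Because the condition only tests a statically configured key for null, it should not depend on any value that is unknown during the plan.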

Hi again @jbardin,

We tried to use the can() function instead of the try() function.
A colleague of mine had seen that Microsoft uses this as an alternative to try() in their CAF-modules (terraform-azurerm-caf/backup_vaults.tf at d8fa284cecf4798e3a7dfaa2c28cad30a9aa700a · aztfmod/terraform-azurerm-caf · GitHub) for example.

Using

virtual_machine_id    = can(var.windows_virtual_machines_data[each.value.windows_virtual_machine_key].id) ? var.windows_virtual_machines_data[each.value.windows_virtual_machine_key].id : var.linux_virtual_machines_data[each.value.linux_virtual_machine_key].id

worked for us.

Thanks for your help!