With the following configuration, I expected terraform would only replace a specific instance of a resource, but it seems to want to force replacement of all instances. Is there a method to only replace an instance with a matching key?
resource "terraform_data" "replace_on_node_count_change" {
input = toset([ for key, value in local.worker_nodepool_configs : value.node_count ])
}
resource "azurerm_kubernetes_cluster_node_pool" "shared_worker_vmss" {
for_each = toset(var.worker_nodepool_names)
lifecycle {
replace_triggered_by = [terraform_data.replace_on_node_count_change.input[each.key]]
}
}
I’ve also tried other combinations, including using triggers_replace and for_each in the terraform_data resource. I believe I’ve tried every combination conceivable to make a single instance of the node pool be replaced, but everything I tried results in either all of the node pools being planned for replacement, or a single node pool getting updated in-place instead of being replaced.
It appears that when the terraform_data resource is updated, if I use for_each there, it updates all outputs and IDs even if only a single input value has changed. When I don’t use for_each and instead provide a map to input or triggers_replace, then reference the specific map entry in replace_triggered_by, the node pools still all get replaced.
Here’s another example, along with output.
Code:
resource "terraform_data" "replace_on_node_count_change" {
for_each = tomap({ for key, value in local.worker_nodepool_configs : key => value.node_count })
triggers_replace = each.value
}
Output:
# terraform_data.replace_on_node_count_change["bluez1"] must be replaced
-/+ resource "terraform_data" "replace_on_node_count_change" {
~ id = "a482cac0-f8ca-7ba1-bc2a-b43218c026f6" -> (known after apply)
+ triggers_replace = 2
}
# terraform_data.replace_on_node_count_change["bluez2"] will be updated in-place
~ resource "terraform_data" "replace_on_node_count_change" {
~ id = "a58e1b99-55f3-e12d-b5ef-f1d2295612ba" -> (known after apply)
}
The node count input for key bluez1 was changed from 1 to 2, so why does the bluez2 terraform_data resource instance get updated in-place here? Its triggers_replace value didn’t change, and it doesn’t have an input value at all that could have caused its ID to change.
The goal I’m trying to accomplish is a little complex to explain so I’ll do my best. I’m operating in Azure working on an AKS cluster in an environment where auto-scaling is currently disabled and we need to enable it. We have 4 node pools currently with one node in each (2 pools named green and blue for each of 2 zones; so greenzone1, greenzone2, bluezone1, and bluezone2) and we may add additional node pools in the future to accommodate larger nodes or other requirements, which may or may not have auto-scaling enabled.
I also made some additional changes to the disk configuration of the above node pools which caused all 4 node pools to be replaced at the same time and took down all of the workloads in the process. Fortunately this was in a sandbox environment and over a weekend, so no harm no foul, but this is something I want to avoid doing in production. So I reworked the configuration to allow for configuring all of the node pools the same, and then having a variable which can provide override configs for each node pool. This would allow me to perform the disk changes on one node pool at a time. This part works beautifully.
Where I run into an issue now is with auto-scaling itself. Azure recommends ignoring the node_count field when auto-scaling is enabled. Since my node pools are configured by a for_each, the node_count field gets ignored on all pools. No big deal unless we want to scale a node pool manually, or change auto-scaling from enabled to disabled.
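Roughly, the relevant part of my node pool configuration looks like the sketch below (other node pool arguments omitted; the attribute name matches the azurerm provider):

resource "azurerm_kubernetes_cluster_node_pool" "shared_worker_vmss" {
  for_each = toset(var.worker_nodepool_names)

  # ... other node pool arguments omitted ...

  lifecycle {
    # Ignore drift in node_count (driven by the cluster autoscaler or by
    # manual scaling) on every pool created by this for_each.
    ignore_changes = [node_count]
  }
}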
When disabling auto-scaling, the node_count field becomes required. But when the resource already exists, the provider does an in-place update to disable auto-scaling, and since node_count is ignored, Terraform throws an error that the node_count field is required. Because of this, I had thought to use replace_triggered_by along with terraform_data, in the manner shown in my OP and previous comments, to force replacement of a single node pool at a time when either changing the node_count field in the configuration or disabling auto-scaling.
Unfortunately, it seems like Terraform wants to update all of the instances of the terraform_data resource whenever a change happens to only one instance, or to one value within a map input on a single instance of the resource, which in turn causes all 4 node pools to be replaced at the same time again, regardless of how I configure terraform_data or the replace_triggered_by field. This is definitely unexpected behavior on the part of the terraform_data resource, though the behavior on the node pool resource makes sense given the updates I’m seeing with terraform_data. I’m wondering if this is a bug that I should report or if this is working as designed, and how I can accomplish what I’m trying to do.
I managed to come up with a workaround by abusing the terraform_data resource and having a duplicate definition of the node pool resource. It’s not pretty, but it works for the moment.
resource "terraform_data" "auto_scaling_vmss" {
input = azurerm_kubernetes_cluster_node_pool.auto_scaling_shared_worker_vmss
}
## The worker node pool configuration for node pools with auto-scaling
resource "azurerm_kubernetes_cluster_node_pool" "auto_scaling_shared_worker_vmss" {
for_each = toset([ for pool_name, pool_config in local.worker_nodepool_configs : pool_name if pool_config.enable_auto_scaling ])
...
}
## The worker node pool configuration for node pools without auto-scaling
resource "azurerm_kubernetes_cluster_node_pool" "no_auto_scaling_shared_worker_vmss" {
for_each = toset([
for pool_name, pool_config in local.worker_nodepool_configs : pool_name if (
!pool_config.enable_auto_scaling &&
try(terraform_data.auto_scaling_vmss[pool_name], null) == null
)
])
...
}
This could be done without the terraform_data resource; however, the provider would then try to do the destroy and the create simultaneously, which doesn’t work because the creation errors out (the resource already exists) until the destroy completes. The terraform_data resource references the auto-scaling VMSS, which creates an implicit dependency, so it does not get updated until the destruction is complete. The for_each in the non-auto-scaling VMSS references the terraform_data resource, and thus the non-auto-scaling VMSS gets created only after the destruction of the auto-scaling node pool is complete.
I’ve also tested switching auto-scaling off with the above configuration, and found a side effect which Terraform reports as a bug in the terraform_data resource, though I’m not certain it is. The side effect is that the terraform_data resource produces an inconsistent final plan after destroying a node pool with auto-scaling disabled, so running terraform apply causes an error about the inconsistency. The destroy happens, then the terraform_data resource tries to update but gets a mismatched output from the attributes of the node pool instance being destroyed and recreated in this way. Simply running terraform apply again completes the job.
Hi @thomas.spear,
It’s hard to follow what’s going on here since you bounced around a bit, but it would be easier to see the problem if we go back and complete a minimal example like you started with. There’s no reason you can’t use replace_triggered_by with an individual instance, but IIUC you probably want to set up a corresponding instance of terraform_data to trigger the change. The way replace_triggered_by works is that it needs to detect a change, but if all instances point to the same resource, they may all be seeing the same change.
Something like this (using another terraform_data as a proxy for azurerm_kubernetes_cluster_node_pool):
locals {
  test = {
    one   = "first"
    two   = "second"
    three = "third"
  }
}

resource "terraform_data" "replace_on_node_count_change" {
  for_each = local.test
  input    = each.value
}

resource "terraform_data" "shared_worker_vmss" {
  for_each = local.test

  lifecycle {
    replace_triggered_by = [terraform_data.replace_on_node_count_change[each.key]]
  }
}
Here, changing any of the local.test values will replace only the corresponding shared_worker_vmss.
This is what I’ve described above, and it is not functioning properly for me.
Here’s a more recent attempt, including keys and values for locals (simplified from the actual logic for demonstration purposes):
locals {
  worker_nodepool_names = ["bluez1", "greenz1", "bluez2", "greenz2"]

  # Complex logic in worker_nodepool_configs has been simplified
  worker_nodepool_configs = {
    bluez1 = {
      enable_auto_scaling = true
      max_count           = 15
      min_count           = 1
      node_count          = null
    }
    bluez2 = {
      enable_auto_scaling = true
      max_count           = 15
      min_count           = 1
      node_count          = null
    }
    greenz1 = {
      enable_auto_scaling = true
      max_count           = 15
      min_count           = 1
      node_count          = null
    }
    greenz2 = {
      enable_auto_scaling = true
      max_count           = 15
      min_count           = 1
      node_count          = null
    }
  }
}

resource "terraform_data" "replace_on_node_count_change" {
  for_each = tomap({ for pool_name, pool_config in local.worker_nodepool_configs : pool_name => pool_config.node_count })
  input    = each.value
}

resource "azurerm_kubernetes_cluster_node_pool" "shared_worker_vmss" {
  for_each = toset(var.worker_nodepool_names)

  lifecycle {
    replace_triggered_by = [terraform_data.replace_on_node_count_change.input[each.key]]
  }
}
I’ve included enable_auto_scaling as it is central to my requirement.
Primary points:
- If I have created the node pool with auto scaling enabled, node_count is set to null in state.
- Azure recommends ignoring changes to node_count when auto scaling is enabled, but requires it to be set if auto scaling is disabled.
- Therefore, if I disable auto scaling by setting enable_auto_scaling to false, I must specify node_count.
- If I’ve ignored changes to node_count and disable auto scaling for one node pool, node_count remains null, which causes an error from the provider for that node pool because the provider is doing an in-place update.
- I’ve added the terraform_data resource as a sort of proxy as you mentioned, to work around this.
- After adding terraform_data, before making changes to the configuration, I ran terraform apply to get the resource into the state.
- Then I set node_count for a single node pool to 1, and set enable_auto_scaling to false.
- When I change the two values (enable_auto_scaling to false and node_count to 1) in the above code, let’s say for “bluez1” (sketched below), all 4 node pools get planned for destroy and recreate when I run terraform plan, when I expected only “bluez1” to be destroyed and recreated.
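For concreteness, this is roughly the change to the simplified locals above (only the bluez1 entry is shown; the other three pools are untouched):

    bluez1 = {
      enable_auto_scaling = false
      max_count           = null
      min_count           = null
      node_count          = 1
    }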
I see the following output in terraform plan, which doesn’t make sense and required the workaround I detailed in my most recent comment before this one.
# azurerm_kubernetes_cluster_node_pool.shared_worker_vmss["bluez1"] will be replaced due to changes in replace_triggered_by
-/+ resource "azurerm_kubernetes_cluster_node_pool" "shared_worker_vmss" {
...
~ enable_auto_scaling = true -> false
- max_count = 15 -> null
- min_count = 1 -> null
~ node_count = null -> 1
...
}
# azurerm_kubernetes_cluster_node_pool.shared_worker_vmss["bluez2"] will be replaced due to changes in replace_triggered_by
-/+ resource "azurerm_kubernetes_cluster_node_pool" "shared_worker_vmss" {
...
~ node_count = null -> (known after apply)
...
}
# azurerm_kubernetes_cluster_node_pool.shared_worker_vmss["greenz1"] will be replaced due to changes in replace_triggered_by
-/+ resource "azurerm_kubernetes_cluster_node_pool" "shared_worker_vmss" {
...
~ node_count = null -> (known after apply)
...
}
# azurerm_kubernetes_cluster_node_pool.shared_worker_vmss["greenz2"] will be replaced due to changes in replace_triggered_by
-/+ resource "azurerm_kubernetes_cluster_node_pool" "shared_worker_vmss" {
...
~ node_count = null -> (known after apply)
...
}
# terraform_data.replace_on_node_count_change["bluez1"] will be updated in-place
~ resource "terraform_data" "replace_on_node_count_change" {
id = "a482cac0-f8ca-7ba1-bc2a-b43218c026f6"
+ input = 2
+ output = (known after apply)
}
# terraform_data.replace_on_node_count_change["bluez2"] will be updated in-place
~ resource "terraform_data" "replace_on_node_count_change" {
id = "a58e1b99-55f3-e12d-b5ef-f1d2295612ba"
+ output = (known after apply)
}
# terraform_data.replace_on_node_count_change["greenz1"] will be updated in-place
~ resource "terraform_data" "replace_on_node_count_change" {
id = "587354d8-e5a6-674f-57f6-4e6e60eb40bb"
+ output = (known after apply)
}
# terraform_data.replace_on_node_count_change["greenz2"] will be updated in-place
~ resource "terraform_data" "replace_on_node_count_change" {
id = "5169b5a3-024c-860a-042d-c64c47f6b593"
+ output = (known after apply)
}
Plan: 4 to add, 4 to change, 4 to destroy.
As you can see from the terraform_data output, only one instance has its input actually changed, but all 4 instances are updated, and so replace_triggered_by in the node pool resource is improperly forcing replacement of all 4 node pools. You’ll also note that the node_count value for “bluez1” is going from null to 1, whereas for the other 3 node pools it is going to (known after apply). Additionally, only “bluez1” mentions switching enable_auto_scaling to false and removes the max_count and min_count values by setting them to null, whereas the others don’t. This matches my actual output; I’ve only trimmed unnecessary/irrelevant lines. Those 3 fields are not getting updated in the other 3 node pools, which is correct behavior as far as I’m concerned, since I haven’t changed the values for those 3 node pools in the local variable.
If you’re prototyping out the syntax here, I would break out just the terraform_data alone to figure out why you’re not able to change only a single instance at a time.
The reference here is also invalid:
terraform_data.replace_on_node_count_change.input[each.key]
But I assume that is a typo and should be:
terraform_data.replace_on_node_count_change[each.key]
As for the terraform_data on its own, the change is quite subtle, and primarily due to your use of tomap in the for_each expression. Because that map for-expression initially has all null values with no type information, Terraform cannot determine the map type, so it ends up with map(any). However, when you change a single node_count to a number, the entire map type must be changed to accommodate that, which means the next plan will use map(number).
I would not have used tomap here since it adds no value and can cause hard-to-diagnose problems, but if the map type is of use to you in a more complex use case, it could be more fully defined with:
for_each = tomap({ for pool_name, pool_config in local.worker_nodepool_configs : pool_name => tonumber(pool_config.node_count) })
Either the above map, or more simply the raw object, will allow only a single instance to be changed at a time:
resource "terraform_data" "replace_on_node_count_change" {
for_each = local.worker_nodepool_configs
input = each.value.node_count
}
Hi, thanks.
If you’re prototyping out the syntax here, I would break out just the terraform_data alone to figure out why you’re not able to change only a single instance at a time.
So, that’s where my second comment came in. I switched to triggers_replace instead of input in the code below, but I did also try it with input.
Code:
resource "terraform_data" "replace_on_node_count_change" {
for_each = tomap({ for key, value in local.worker_nodepool_configs : key => value.node_count })
triggers_replace = each.value
}
Output:
# terraform_data.replace_on_node_count_change["bluez1"] must be replaced
-/+ resource "terraform_data" "replace_on_node_count_change" {
~ id = "a482cac0-f8ca-7ba1-bc2a-b43218c026f6" -> (known after apply)
+ triggers_replace = 2
}
# terraform_data.replace_on_node_count_change["bluez2"] will be updated in-place
~ resource "terraform_data" "replace_on_node_count_change" {
~ id = "a58e1b99-55f3-e12d-b5ef-f1d2295612ba" -> (known after apply)
}
The reference here is also invalid:
terraform_data.replace_on_node_count_change.input[each.key]
Really? It seems to “work” (it doesn’t throw any errors when I don’t have for_each in the terraform_data resource and instead use a { for key, value in ... } on the input field), as does referencing triggers_replace, and so does referencing the id field on this resource. I’ll try without referencing any of them. Though all of it may be moot, because I think you’re onto something with the next statement.
As for the terraform_data on its own, the change is quite subtle, and primarily due to your use of tomap in the for_each expression. Because that map for-expression initially has all null values with no type information, Terraform cannot determine the map type, so it ends up with map(any). However, when you change a single node_count to a number, the entire map type must be changed to accommodate that, which means the next plan will use map(number).
I would not have used tomap here since it adds no value and can cause hard to diagnose problems, but if the map type is of use to you in a more complex use case, it could be more fully defined with
Question: for_each only works with sets and maps. If I don’t use tomap, what would the for_each line look like? The for_each, pool_name, and pool_config.node_count are necessary, but if it doesn’t need tomap, how would this (your line pasted below) look?
for_each = tomap({ for pool_name, pool_config in local.worker_nodepool_configs : pool_name => tonumber(pool_config.node_count) })
Thanks for your insights! I do think adding tonumber() around pool_config.node_count will ultimately solve the issue, because it’ll wrap null with tonumber() and keep the type fed to for_each consistent regardless of the value held by pool_config.node_count.
That would be a valid reference when the resource does not have for_each, but your last example did use for_each. You also wouldn’t want to be indexing the input attribute; you’re only looking for a change in the instance, so referencing only the instance will simplify things. The intent of replace_triggered_by is to couple the lifecycles of two resources, so while it’s valid to use more specific attributes of a resource, it’s often just overcomplicating things.
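In other words, something along these lines (a sketch based on your earlier node pool example, with the other arguments omitted):

resource "azurerm_kubernetes_cluster_node_pool" "shared_worker_vmss" {
  for_each = toset(var.worker_nodepool_names)

  # ... node pool arguments omitted ...

  lifecycle {
    # Reference the whole corresponding terraform_data instance rather than
    # indexing into its input attribute.
    replace_triggered_by = [terraform_data.replace_on_node_count_change[each.key]]
  }
}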
There does seem to be some documentation that specifies a map type as an argument to for_each, but an object is valid there as well, and can often be more predictable, as seen here.
Interesting! I’ve always assumed objects needed to be converted to maps for for_each based on that documentation you mentioned. I’ll give both ways a try.
After testing both scenarios detailed below, I can confirm both work. The second way requires updating the state to convert the terraform_data inputs to objects before it properly functions, but that makes sense, and I can comment out replace_triggered_by in the node pools temporarily to get that update made. Thank you for the help!
tomap with tonumber:
for_each = tomap({ for pool_name, pool_config in local.worker_nodepool_configs : pool_name => tonumber(pool_config.node_count) })
No tomap and no tonumber:
for_each = { for pool_name, pool_config in local.worker_nodepool_configs : pool_name => pool_config.node_count }
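For completeness, a sketch of the resulting configuration that worked for me (node pool arguments omitted, using the names from earlier in the thread):

resource "terraform_data" "replace_on_node_count_change" {
  for_each = { for pool_name, pool_config in local.worker_nodepool_configs : pool_name => pool_config.node_count }
  input    = each.value
}

resource "azurerm_kubernetes_cluster_node_pool" "shared_worker_vmss" {
  for_each = toset(var.worker_nodepool_names)

  # ... node pool arguments omitted ...

  lifecycle {
    # Changing one pool's node_count now replaces only that pool's
    # terraform_data instance, and therefore only that node pool.
    replace_triggered_by = [terraform_data.replace_on_node_count_change[each.key]]
  }
}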