I’m working with the following Terraform setup:

- `Resource_A`
- `Resource_B`, which includes `replace_triggered_by = [Resource_A]`

In the initial `terraform plan`, both `Resource_A` and `Resource_B` are planned for updates, with `Resource_B` triggered for replacement due to changes in `Resource_A`.

During `terraform apply`, `Resource_A` is updated successfully, but the apply fails during the replacement of `Resource_B` due to an API throttling error. As a result, `Resource_B` is not replaced.

On the subsequent `terraform plan`, no changes to `Resource_B` are detected, which is unexpected because its replacement was started but never completed.

Question: What is the best way to handle this situation so that dependent resources like `Resource_B` are not skipped after a partial apply failure, especially when using `replace_triggered_by`?
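For context, the failing setup can be sketched roughly like this (the resource type and names are placeholders, not my real configuration):

```hcl
resource "some_resource" "Resource_A" {
  # ... attributes that were modified in the failed apply ...
}

resource "some_resource" "Resource_B" {
  # ...

  lifecycle {
    # Any change to Resource_A forces Resource_B to be replaced
    replace_triggered_by = [some_resource.Resource_A]
  }
}
```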
This sounds like an issue I’ve seen before, where a `terraform apply` updates the state before the API call the provider makes has returned successfully.

When `replace_triggered_by` is not involved, you can run `terraform refresh` to make the state consistent with the real infrastructure, followed by `terraform apply` to make your updates. With `replace_triggered_by` involved, that approach would not work: the refresh would make `Resource_B`’s state consistent, and the second apply would see no change to `Resource_A` (since it was already consistent), so `Resource_B` would see no change/replacement unless it depended on a changed output from `Resource_A`. That is essentially what you’re seeing in your case.
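For reference, the refresh-then-apply sequence described above (which only helps when `replace_triggered_by` is not involved) would look like this:

```shell
# Reconcile Terraform state with the real remote objects
terraform refresh

# Re-plan and apply; any remaining differences are picked up again
terraform apply
```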
There is a workaround I can think of: use an intermediary `terraform_data` resource, and configure `Resource_B` to `replace_triggered_by` a change to either `Resource_A` or the `terraform_data` resource:
```hcl
resource "some_resource" "Resource_A" {
  # ...
}

# A dummy variable, not used anywhere else in the configuration
variable "some_dummy_input" {
  type = string
}

resource "terraform_data" "intermediary_resource" {
  # Any change to this value replaces the terraform_data resource
  triggers_replace = {
    input_to_force_replacement = var.some_dummy_input
  }
}

resource "some_resource" "Resource_B" {
  # ...

  lifecycle {
    replace_triggered_by = [
      some_resource.Resource_A,
      terraform_data.intermediary_resource,
    ]
  }
}
```
In this setup, you now have an input variable whose value you can change arbitrarily to force the replacement if this failure occurs again. Since `replace_triggered_by` takes a list, it triggers when either of the referenced resources changes; it does not require both to change. The dummy variable is not used anywhere else in your configuration; it exists only so that changing it replaces the `terraform_data` resource, which in turn triggers `Resource_B` to be replaced. The trick is simple: update the `terraform.tfvars` value for `var.some_dummy_input` and run `terraform apply` to force the replacement of `Resource_B` in case this happens again.
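Concretely, the recovery step could look like this (updating `terraform.tfvars` works the same way; passing `-var` on the command line is just the equivalent one-liner):

```shell
# Set a new, arbitrary value for the dummy variable. The change
# replaces terraform_data.intermediary_resource, which in turn
# forces the replacement of Resource_B.
terraform apply -var='some_dummy_input=force-replace-1'
```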
`replace_triggered_by` uses the presence of changes to determine the action to take, so, as you’ve seen, once the triggering changes are gone it doesn’t take any action. The direct solution here is to use the `-replace` flag to tell Terraform which instances still require replacement.
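Using the resource address from your example, that would be:

```shell
# Plan and apply with Resource_B forced to be replaced, regardless
# of whether its replace_triggered_by references currently show changes
terraform apply -replace="some_resource.Resource_B"
```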
This was an accepted drawback initially, because failures like this often require manual intervention to repair anyway, such as when a resource exceeds its timeout or is created in a broken state.
I think a case could be made that Terraform should taint the instances that planned a replacement due to `replace_triggered_by` but end up not being replaced during the apply. It’s not exactly straightforward to implement, because those instances are never visited once the earlier failure occurs, but it should be possible.
Actually, I’m not sure why they weren’t tainted in the first place; perhaps the failure happens before the replacement process actually starts, which might be worth looking into as well.
Thank you for the detailed responses. It sounds like, at this point, the best we can do is remediate the side effects manually once the issue occurs. I agree that tainting resources that were planned for replacement, but not actually replaced due to a failure, would be a valuable improvement. Hopefully this can be considered for a future enhancement to Terraform.