I’m working with the following Terraform setup:

- `Resource_A`
- `Resource_B`, which includes `replace_triggered_by = [Resource_A]`

In the initial `terraform plan`, both `Resource_A` and `Resource_B` are planned for updates, with `Resource_B` triggered for replacement due to changes in `Resource_A`.

During `terraform apply`, `Resource_A` is updated successfully, but the apply fails during the replacement of `Resource_B` due to an API throttling error. As a result, `Resource_B` is not replaced.

On the subsequent `terraform plan`, no changes to `Resource_B` are detected, which is unexpected because its replacement was started but never completed.

Question: What is the best way to handle this situation so that dependent resources like `Resource_B` are not skipped after a partial apply failure, especially when using `replace_triggered_by`?
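For context, the failing setup can be sketched roughly like this (the resource type and names are placeholders, not my real configuration):

```hcl
resource "some_resource" "Resource_A" {
  # ... attributes that were modified in the failed apply ...
}

resource "some_resource" "Resource_B" {
  # ...

  lifecycle {
    # Any change to Resource_A forces Resource_B to be replaced
    replace_triggered_by = [some_resource.Resource_A]
  }
}
```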
This sounds like an issue I’ve seen before, where a `terraform apply` updates the state before the API call the provider makes has returned successfully.

When `replace_triggered_by` is not involved, you can run `terraform refresh` to make the state consistent with the real infrastructure, followed by `terraform apply` to make your updates. With `replace_triggered_by` involved, that approach would not work: the refresh would make `Resource_B`’s state consistent, and the second apply would see no change to `Resource_A` (since it was already consistent), so `Resource_B` would see no change/replacement unless it depended on a changed output from `Resource_A`. That is essentially what you’re seeing in your case.
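For reference, the refresh-then-apply sequence described above (which only helps when `replace_triggered_by` is not involved) would look like this:

```shell
# Reconcile Terraform state with the real remote objects
terraform refresh

# Re-plan and apply; any remaining differences are picked up again
terraform apply
```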
There is a workaround I can think of: use an intermediary `terraform_data` resource, and configure `Resource_B` to `replace_triggered_by` a change to either `Resource_A` or the `terraform_data` resource:
```hcl
resource "some_resource" "Resource_A" {
  # ...
}

# A dummy variable, not used anywhere else in the configuration
variable "some_dummy_input" {
  type = string
}

resource "terraform_data" "intermediary_resource" {
  # Any change to this value replaces the terraform_data resource
  triggers_replace = {
    input_to_force_replacement = var.some_dummy_input
  }
}

resource "some_resource" "Resource_B" {
  # ...

  lifecycle {
    replace_triggered_by = [
      some_resource.Resource_A,
      terraform_data.intermediary_resource,
    ]
  }
}
```
In this setup, you now have an input variable whose value you can change arbitrarily to force the replacement if this failure occurs again. Since `replace_triggered_by` takes a list, it triggers when either of the referenced resources changes; it does not require both to change. The dummy variable is not used anywhere else in your configuration; it exists only so that changing it replaces the `terraform_data` resource, which in turn triggers `Resource_B` to be replaced. The trick is simple: update the `terraform.tfvars` value for `var.some_dummy_input` and run `terraform apply` to force the replacement of `Resource_B` in case this happens again.
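Concretely, the recovery step could look like this (updating `terraform.tfvars` works the same way; passing `-var` on the command line is just the equivalent one-liner):

```shell
# Set a new, arbitrary value for the dummy variable. The change
# replaces terraform_data.intermediary_resource, which in turn
# forces the replacement of Resource_B.
terraform apply -var='some_dummy_input=force-replace-1'
```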
`replace_triggered_by` uses the presence of changes to determine the action to take, so, as you’ve seen, once the triggering changes are gone it doesn’t take any action. The direct solution here is to use the `-replace` flag to tell Terraform which instances still require replacement.
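Using the resource address from your example, that would be:

```shell
# Plan and apply with Resource_B forced to be replaced, regardless
# of whether its replace_triggered_by references currently show changes
terraform apply -replace="some_resource.Resource_B"
```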
This was an accepted drawback initially, because failures like this often require manual intervention to repair anyway, such as when a resource exceeds its timeout or is created in a broken state.
I think a case could be made that Terraform should taint the instances that planned a replacement due to `replace_triggered_by` but end up not being replaced during the apply. It’s not exactly straightforward to implement, because those instances are never visited once the earlier failure occurs, but it should be possible.
Actually, I’m not sure why they weren’t tainted in the first place; perhaps the failure happens before the replacement process actually starts, which might be worth looking into as well.
Thank you for the detailed responses. It sounds like, at this point, the best we can do is remediate the side effects manually once the issue occurs. I agree that tainting resources that were planned for replacement, but not actually replaced due to a failure, would be a valuable improvement. Hopefully this can be considered for a future enhancement to Terraform.