Terraform plan changes based on existence of depends_on in module

Due to a bug in the aws provider, I noticed an interesting behaviour in Terraform that maybe someone can help explain.

I have modules a and b whose inputs/outputs don’t depend on each other, but I would like them to be provisioned one after the other.

If module b has depends_on = [module.a] and a changes, Terraform will evaluate b for changes, even though nothing in b has changed. I notice this because a bug in b causes some data sources to change erroneously, which causes re-creation of some resources. So I see a diff based on b as well as a.

If in module b I don’t have a depends_on and a changes, b isn’t re-evaluated at all, so the bug that causes re-creation of resources isn’t observed. The diff is only based on the changes in a.
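
For concreteness, the two cases being compared could be sketched roughly as follows. The module names a and b come from the description above; the source paths are hypothetical placeholders and the module contents are omitted.

```hcl
# Hypothetical layout: the source paths are placeholders, not taken from the real configuration.
module "a" {
  source = "./modules/a" # resources from one provider (AAD in this case)
}

module "b" {
  source = "./modules/b" # resources and data sources from another provider (AWS in this case)

  # Case 1: with this line present, a change in module.a also produces a diff in module.b.
  # Case 2: with this line removed, the plan only shows the changes in module.a.
  depends_on = [module.a]
}
```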

Any idea why b seems to be assessed differently depending on whether or not it has a depends_on?

Side note: the two modules are from two totally different providers (AAD and AWS), so the resources in them are really independent. The bug observed in the aws provider is the following: “Unchanged data.aws_ssoadmin_instances.arns causes recreation of permission sets and account assignments” (Issue #22188, hashicorp/terraform-provider-aws on GitHub).

It looks like having a depends_on causes all the data sources to be forcefully re-evaluated when a changes, in case b depends on data that a has modified.

Hi @mimozell,

I’m not sure I follow exactly what’s going on here from your description, and so I can’t give a specific answer for your case, but I can give a general note:

depends_on gives Terraform less information about your intent than a direct expression reference would, because it states that anything done to the dependency object must happen before anything done to the one declaring the dependency. As a result, Terraform will often make more conservative plans (that is: plans which assume less and therefore propose to change more to ensure correct ordering) with depends_on.

Depending on an entire module call is particularly tricky because you tell Terraform that it should order every operation planned inside the module before the object declaring the dependency.

You can typically get more precise results if you avoid using depends_on and instead use expression references to imply dependencies wherever possible. In that case, Terraform can see specifically which value the reference derives from and thus avoid proposing changes if that particular value hasn’t changed, even if other parts of the upstream object have planned changes.
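
As a rough sketch of that suggestion, an expression reference between two modules might look like the following. The output name group_id and the matching input variable are hypothetical, chosen only for illustration; the point is that module b refers to one specific value from module a instead of declaring depends_on.

```hcl
# modules/a/outputs.tf (hypothetical output name)
# output "group_id" {
#   value = ... # some attribute of a resource managed inside module a
# }

# modules/b/variables.tf (hypothetical input name)
# variable "group_id" {
#   type = string
# }

# Root module
module "a" {
  source = "./modules/a"
}

module "b" {
  source = "./modules/b"

  # The dependency is implied by this reference, so no depends_on is needed.
  # Terraform only has to treat things derived from var.group_id inside b as
  # dependent on module a, and only on this one output value.
  group_id = module.a.group_id
}
```

This only helps when there is a real value to pass along. If the two modules genuinely share no data, as in the original question, there may be nothing natural to reference.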

Thanks @apparentlymart :) I think it makes sense, although I didn’t expect it to be like this.

I tried to set up a simple example of my observation here:
GitHub - mimozell/depends-on-example. Let me know if you have any further comments based on it :)

Actually, I made the same observation: depends_on not only changes the order but also how values are computed, ending up in different plan/apply cycles on values which shouldn’t change, no matter whether depends_on is used or not.

This issue comes very close to my case.

One specific way that depends_on can affect an outcome is to force a value to be unknown “(known after apply)” rather than to be a concrete known value. That is a consequence of forcing particular read actions to happen during the apply step instead of the plan step, and is by design.
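
A minimal sketch of how this can surface, assuming a module that is called with depends_on = [module.a] and that contains something like the following. The resource names and the permission set name are hypothetical; the data source is the one named in the linked provider issue.

```hcl
# Contents of a module that is called with depends_on = [module.a] (assumed setup).

# If module.a has pending changes, this read can be deferred to the apply step,
# so its attributes show as "(known after apply)" in the plan.
data "aws_ssoadmin_instances" "this" {}

resource "aws_ssoadmin_permission_set" "example" {
  name = "example" # hypothetical name

  # Derived from the data source above. While that value is unknown during
  # planning, Terraform cannot tell that it will be unchanged, so the plan
  # may propose replacing this resource.
  instance_arn = tolist(data.aws_ssoadmin_instances.this.arns)[0]
}
```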

As I mentioned above, depends_on gives Terraform less information about your intent, so it knows less during planning and includes more unknown values in the plan, which affects the outcome. The solution is to use more precise declarations of dependencies, ideally involving direct references to particular values. depends_on for modules is particularly tricky, because it effectively creates many additional dependency edges all at once; this is why we resisted adding depends_on for modules for a long time, but eventually added it due to high demand even though it does come with this significant downside.

As engineers, we sometimes think we’re smarter than Terraform :)
I went back to the documentation and couldn’t find a note that depends_on has some side-effects other than just changing the order of resource deployments.

Hi @tbugfinder,

I suppose it’s reasonable to disagree about what exactly “changing the order of resource deployments” might mean; to me, data source reads are one kind of action whose ordering is affected, and thus it is intuitive to me that depends_on can make a data source read happen after, for example, a managed resource “create” action, during the apply step.

But I can also see it as reasonable that you might expect a “read” to be a different category of action from the others, one that isn’t subject to dependencies. Indeed, earlier versions of Terraform did treat reads that way, and that was continually reported as a bug: Terraform was sometimes trying to read something before it had been created or updated, and thus getting the wrong answer. We agreed with those reports and therefore fixed it so that the ordering would be correct across all actions.

But it certainly couldn’t hurt to be more explicit in the documentation about what exactly dependencies affect, and therefore what effects introducing a new one (regardless of how you do it) might have. It would also be helpful, once that information is somewhere, to update the depends_on documentation you referred to so that it’s clear that depends_on is less precise than an expression reference and can thus have a much more severe effect on the ordering than an expression reference can.

With that said, the folks who primarily maintain the documentation don’t tend to closely follow our discussions in this forum, so I think it’d help to report that as a documentation feature request in the Terraform GitHub repository. If you do so, I’d suggest stating in your own words what you currently understand the behavior to be (based on this discussion) and what you originally expected (before we had this discussion), because documentation writers can then take cues from how you describe it when thinking about how best to present and organize the updated documentation.

Thanks!

We do have a separate section under Data Sources, “Data Resource Dependencies”, but that could probably be made more discoverable from the other page.