Terraform plan changes based on existence of depends_on in module

mimozell · December 14, 2021, 8:55am

Due to a bug in the aws provider, I noticed an interesting behaviour in Terraform that maybe someone can help explain.

I have modules a and b whose inputs/outputs don’t depend on each other, but I would like them to be provisioned one after the other.

If module b has depends_on = [module.a] and a changes, Terraform evaluates b for changes as well, even though nothing in b has changed. I noticed this because a bug in b causes some data sources to change erroneously, which causes re-creation of some resources. So I see a diff for b as well as for a.

If in module b I don’t have a depends_on and a changes, b isn’t re-evaluated at all, so the bug that causes re-creation of resources isn’t observed. The diff is only based on the changes in a.
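A minimal sketch of the two setups being compared. The module names match the description above, but the source paths are hypothetical and only there to make the example complete:

```hcl
# Hypothetical layout: source paths are illustrative, not the real ones.
module "a" {
  source = "./modules/a" # e.g. the AAD resources
}

module "b" {
  source = "./modules/b" # e.g. the AWS resources

  # Variant 1: explicit ordering. With this line present, a planned change
  # in module.a also produces a diff inside module.b.
  depends_on = [module.a]

  # Variant 2: remove the depends_on line above. A change in module.a then
  # leaves module.b out of the plan entirely.
}
```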

Any idea why b seems to be assessed differently depending on whether or not it has a depends_on?

Side note: the two modules are from 2 totally different providers (AAD and AWS), so the resources in them are really independent. The bug observed in the aws provider is the following: Unchanged data.aws_ssoadmin_instances.arns causes recreation of permission sets and account assignments · Issue #22188 · hashicorp/terraform-provider-aws · GitHub

It looks like having a depends_on causes all the data sources to be forcefully re-evaluated when a changes, in case b depends on data that a has modified.

apparentlymart · December 14, 2021, 4:32pm

Hi @mimozell,

I’m not sure I follow exactly what’s going on here from your description, and so I can’t give a specific answer for your case, but I can give a general note:

depends_on gives Terraform less information about your intent than a direct expression reference would, because it states that anything done to the dependency object must happen before anything done to the one declaring the dependency. As a result, Terraform will often make more conservative plans (that is: plans which assume less and therefore propose to change more to ensure correct ordering) with depends_on.

Depending on an entire module call is particularly tricky because you tell Terraform that it should order every operation planned inside the module before the object declaring the dependency.

You can typically get more precise results if you avoid using depends_on and instead use expression references to imply dependencies wherever possible. In that case, Terraform can see specifically which value the reference derives from and thus avoid proposing changes if that particular value hasn’t changed, even if other parts of the upstream object have planned changes.
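As a rough illustration of the difference, here is a sketch that orders b after a through an expression reference instead of depends_on. The output and variable name `group_id` and the source paths are assumptions made for the example; module a would need to declare a matching output and module b a matching variable:

```hcl
module "a" {
  source = "./modules/a" # hypothetical path
}

module "b" {
  source = "./modules/b" # hypothetical path

  # Hypothetical input fed from a hypothetical output of module a.
  # Only the objects in b that actually use var.group_id are ordered after
  # whatever this output derives from in a; there is no depends_on, so the
  # rest of b is not tied to every operation planned inside a.
  group_id = module.a.group_id
}
```

With this form, a planned change elsewhere in a that leaves `group_id` untouched should not, by itself, make values inside b unknown during planning.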

mimozell · December 15, 2021, 11:07am

Thanks @apparentlymart, I think it makes sense, although I didn’t expect it to be like this.

I tried to set up a simple example of my observation here:
GitHub - mimozell/depends-on-example. Let me know if you have any further comments based on it

tbugfinder · December 20, 2021, 9:35pm

Actually, I made the same observation: depends_on not only changes the order but also how values are computed, ending up in different plan/apply cycles on values which shouldn’t change, no matter whether depends_on is used or not.

This issue comes very close to my case.

github.com/hashicorp/terraform-provider-aws

Terraform thinks user-data is changing when it isn't, resulting in unnecessary resource replacement

opened 07:20PM - 27 Jun 18 UTC

ghost

bug service/autoscaling

_This issue was originally opened by @Lsquared13 as hashicorp/terraform#18343. It was migrated here as a result of the [provider split](https://www.hashicorp.com/blog/upcoming-provider-changes-in-terraform-0-10/). The original body of the issue is below._

<hr>

### Terraform Version

```
Terraform v0.11.7
+ provider.aws v1.25.0
+ provider.local v1.1.0
+ provider.null v1.0.0
+ provider.template v1.0.0
+ provider.tls v1.1.0
```

### Terraform Configuration Files

```hcl
module "vault_cluster" {
  source = "github.com/hashicorp/terraform-aws-vault.git//modules/vault-cluster?ref=v0.0.8"

  cluster_name = "REDACTED"
  cluster_size = "${var.vault_cluster_size}"
  instance_type = "${var.vault_instance_type}"

  ami_id = "${var.vault_consul_ami}"
  user_data = "${data.template_file.user_data_vault_cluster.rendered}"

  s3_bucket_name = "${aws_s3_bucket.REDACTED.id}"
  force_destroy_s3_bucket = "${var.force_destroy_s3_bucket}"

  vpc_id = "${var.aws_vpc}"
  subnet_ids = "${aws_subnet.vault.*.id}"

  target_group_arns = ["${aws_lb_target_group.REDACTED.arn}"]

  allowed_ssh_cidr_blocks = ["0.0.0.0/0"]
  allowed_inbound_cidr_blocks = ["0.0.0.0/0"]
  allowed_inbound_security_group_ids = []
  ssh_key_name = "${aws_key_pair.auth.id}"
}

data "template_file" "user_data_vault_cluster" {
  template = "${file("${path.module}/user-data/user-data-vault.sh")}"

  vars {
    aws_region = "${var.aws_region}"
    s3_bucket_name = "${aws_s3_bucket.REDACTED.id}"
    consul_cluster_tag_key = "${module.consul_cluster.cluster_tag_key}"
    consul_cluster_tag_value = "${module.consul_cluster.cluster_tag_value}"
    vault_cert_bucket = "${aws_s3_bucket.vault_certs.bucket}"
    REDACTED_role = "${var.REDACTED_role}"
    REDACTED_role = "${var.REDACTED_role}"
  }
}
```

### Expected Behavior

I expect that since none of the user data variables has changed, my second time running `terraform init` proposes no changes to the infrastructure. The removal of the tag on the s3 bucket could be ignored though I still find it confusing. (See Actual Behavior for more info)

### Actual Behavior

The second time I run `terraform init` it proposes the following plan. In particular, my issue is that the unexpected user data hash change is forcing a new launch configuration which is forcing a new autoscaling group. This makes multiple `apply` operations destructive.

```
Terraform will perform the following actions:

 <= module.REDACTED_vault.data.template_file.user_data_vault_cluster
      id: <computed>
      rendered: <computed>
      template: "REDACTED"
      vars.%: "7"
      vars.aws_region: "us-east-1"
      vars.consul_cluster_tag_key: "consul-cluster"
      vars.consul_cluster_tag_value: "REDACTED-consul"
      vars.REDACTED_role: "REDACTED-20180627163921901300000003"
      vars.s3_bucket_name: "REDACTED-2018062716392224750000000b"
      vars.REDACTED_role: "REDACTED-20180627163921914400000005"
      vars.vault_cert_bucket: "REDACTED-vault-certs-2018062716392226290000000c"

  ~ module.REDACTED_vault.aws_s3_bucket.REDACTED_vault
      tags.%: "1" => "0"
      tags.Description: "Used for secret storage with Vault. DO NOT DELETE this Bucket unless you know what you are doing." => ""

  ~ module.REDACTED_vault.module.vault_cluster.aws_autoscaling_group.autoscaling_group
      launch_configuration: "REDACTED-vault-20180627164208768300000021" => "${aws_launch_configuration.launch_configuration.name}"

-/+ module.REDACTED_vault.module.vault_cluster.aws_launch_configuration.launch_configuration (new resource required)
      id: "REDACTED-vault-20180627164208768300000021" => <computed> (forces new resource)
      associate_public_ip_address: "false" => "false"
      ebs_block_device.#: "0" => <computed>
      ebs_optimized: "false" => "false"
      enable_monitoring: "true" => "true"
      iam_instance_profile: "REDACTED-vault2018062716392283740000000f" => "REDACTED-vault2018062716392283740000000f"
      image_id: "ami-REDACTED" => "ami-REDACTED"
      instance_type: "t2.medium" => "t2.medium"
      key_name: "REDACTED-key-20180627163922247100000007" => "REDACTED-key-20180627163922247100000007"
      name: "REDACTED-vault-20180627164208768300000021" => <computed>
      name_prefix: "REDACTED-vault-" => "REDACTED-vault-"
      placement_tenancy: "default" => "default"
      root_block_device.#: "1" => "1"
      root_block_device.0.delete_on_termination: "true" => "true"
      root_block_device.0.iops: "0" => <computed>
      root_block_device.0.volume_size: "50" => "50"
      root_block_device.0.volume_type: "standard" => "standard"
      security_groups.#: "1" => "1"
      security_groups.879695302: "sg-11f5815a" => "sg-11f5815a"
      user_data: "9391db96cfba819eefef3353a3f01daf3a50b4ab" => "643cd0eab8f3d7def9cef600a65268cf39d49ecf" (forces new resource)
```

### Steps to Reproduce

1. `terraform apply`
2. `terraform apply`

### References

- hashicorp/terraform#4197 I thought this issue might be related but upgrading to aws provider version 1.25 did not help

apparentlymart · December 20, 2021, 11:22pm

One specific way that depends_on can affect an outcome is to force a value to be unknown “(known after apply)” rather than to be a concrete known value. That is a consequence of forcing particular read actions to happen during the apply step instead of the plan step, and is by design.
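To make that concrete, here is a sketch of what the inside of a module like b from the original question might contain. The resource labels and the permission set name are hypothetical; the data source and the `tolist(...)[0]` pattern follow the AWS provider's documented usage:

```hcl
# Hypothetical contents of module b, which the caller declares with
# depends_on = [module.a].
data "aws_ssoadmin_instances" "this" {}

resource "aws_ssoadmin_permission_set" "example" {
  name = "example" # hypothetical name

  # If module.a has pending changes, the read of the data source above is
  # deferred to the apply step, so this value shows as "(known after apply)"
  # in the plan, even if the real value would turn out to be identical.
  instance_arn = tolist(data.aws_ssoadmin_instances.this.arns)[0]
}
```

An unknown value in an argument that cannot be updated in place is enough for the plan to propose replacing the resource, which would match the recreation of permission sets described in the linked provider issue.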

As I mentioned above, depends_on gives Terraform less information about your intent, so it knows less during planning and includes more unknown values in the plan, which affects the outcome. The solution is to use more precise declarations of dependencies, ideally involving direct references to particular values. depends_on for modules is particularly tricky because it effectively creates many additional dependency edges all at once. This is why we resisted adding depends_on for modules for a long time, but eventually added it due to high demand, even though it comes with this significant downside.

tbugfinder · December 21, 2021, 8:19pm

As engineers, we sometimes think we are smarter than Terraform.
I went back to the documentation and couldn’t find a note that depends_on has some side-effects other than just changing the order of resource deployments.

apparentlymart · December 21, 2021, 11:16pm

Hi @tbugfinder,

I suppose it’s reasonable to disagree about what exactly “changing the order of resource deployments” might mean; to me, data source reads are one kind of action whose ordering is affected, and thus it is intuitive to me that depends_on can make a data source read happen after, for example, a managed resource “create” action, during the apply step.

But I can also see it as reasonable that you might expect a “read” to be a different category of action than the others which isn’t subject to dependencies. Indeed, earlier versions of Terraform did treat them that way, which was continually reported as a bug – Terraform was sometimes trying to read something before it’s been created/updated and thus getting the wrong answer – which we agreed with and therefore fixed it so that the ordering would be correct across all actions.

But it certainly couldn’t hurt to be more explicit in the documentation about what exactly dependencies affect, and therefore what effects introducing a new one (regardless of how you do it) might have. It would also be helpful, once that information is somewhere, to update the depends_on documentation you referred to so that it’s clear that depends_on is less precise than an expression reference and can thus have a much more severe effect on the ordering than an expression reference can.

With that said, the folks who primarily maintain the documentation don’t tend to closely follow our discussions in this forum, so I think it’d help to report that as a documentation feature request in the Terraform GitHub repository. If you do so, I’d suggest stating in your own words what you currently understand the behavior to be (based on this discussion) and what you originally expected (before we had this discussion), because documentation writers can then take cues from how you describe it when thinking about how best to present and organize the updated documentation.

Thanks!

jbardin · December 22, 2021, 5:29pm

We do have a separate section under Data Sources: Data Resource Dependencies, but that could probably be made more discoverable from the other page.
