We recently upgraded to Terraform 1.0.0 after staying on 0.13.7 for a long time.
The upgrade itself went fine, but a recent change was then made in one of our EC2 wrapper modules (totally unrelated to “data external”).
We use that external script to determine which IP addresses (based on our own criteria) to use when launching EC2 instances in for_each blocks. The module upgrade went fine for most of the instance sets (let's say A, B, C), but while upgrading certain other instance sets (e.g. D), Terraform started complaining about A & B: “The “for_each” value depends on resource attributes that cannot be determined until apply, so Terraform cannot predict how many instances will be created. To work around this, use the -target argument to first apply”.
Reverting the module versions of D, or even of A, B, C and D, doesn't solve the problem. The response is inconsistent.
│ Error: Invalid for_each argument
│
│ on .terraform/modules/A/main.tf line 47, in resource "aws_instance" "this":
│ 47: for_each = (var.number_of_instances > 0 && local.keys_rotation_enabled) ? module.common.filtered_iprange_output : toset([])
│ ├────────────────
│ │ local.keys_rotation_enabled is true
│ │ module.common.filtered_iprange_output is a list of string, known only after apply
│ │ var.number_of_instances is 4
│
│ The "for_each" value depends on resource attributes that cannot be determined until apply, so Terraform cannot predict how many instances will be created. To work around this, use the -target argument to first apply
│ only the resources that the for_each depends on.
module.common.filtered_iprange_output is just a local value in the common module, defined as split(",",data.external.cidr_expander.result.filtered_iprange)
However, when I query the state file, I can see that the data external result is already calculated:
$ terraform state show module.A.module.common.data.external.cidr_expander
module.A.module.common.data.external.cidr_expander:
data "external" "cidr_expander" {
id = "-"
program = [
"python",
".terraform/modules/A.common/scripts/myfunction.py",
]
query = {
"blacklisted_ips" = jsonencode([])
"cidr_blocks" = jsonencode(
[
"10.93.81.111",
"10.93.81.112",
"10.93.81.113",
"10.93.81.114",
"10.93.81.115",
]
)
"offset" = "0"
"range_hop" = "1"
}
result = {
"filtered_iprange" = "['10.93.81.111', '10.93.81.112', '10.93.81.113', '10.93.81.114', '10.93.81.115']"
}
}
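To illustrate how the local value is derived from this result, here is a small Python sketch of the same split applied to the stored string. It assumes the string reaches split(",", ...) unchanged; Python's str.split stands in for Terraform's split for this simple case, and the name split_like_terraform is hypothetical.

```python
def split_like_terraform(separator, text):
    """Stand-in for Terraform's split(separator, text) for this simple case."""
    return text.split(separator)


# The result string exactly as shown in the state output above.
filtered_iprange = (
    "['10.93.81.111', '10.93.81.112', '10.93.81.113', "
    "'10.93.81.114', '10.93.81.115']"
)

elements = split_like_terraform(",", filtered_iprange)
for element in elements:
    # repr() makes leading spaces, quotes and brackets visible.
    print(repr(element))
```

Under that assumption the list has five elements, fully determined by the stored string, and each element keeps the surrounding brackets, quotes and spaces.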
I am not sure what behaviour has changed in Terraform 1.0.0 compared to 0.13.
Terraform should know the value of module.common.filtered_iprange_output even in the plan phase.
There have been a lot of requests asking Terraform either not to run data resources in the plan phase or to simply refresh them.
I am not sure if any change made around these is causing side effects for me.
I am guessing it could be an aftereffect of “Data Resource Lifecycle Adjustments” (Issue #17034 in hashicorp/terraform on GitHub).
The data source's arguments are fully known during the plan phase, as it is just a Python script that takes some input and statically generates the output. However, my real question is how upgrading module D can cause this problem to start happening to A & B without any changes to them and without them having any dependency on D.
It is also odd that, when I remove the problematic A & B, Terraform says that it wants to read “data” “external” for some modules during the apply phase. That looks unnecessary, as the “data” “external” can be calculated in the plan phase itself:
# module.C.module.common.data.external.cidr_expander will be read during apply
# (config refers to values not yet known)
<= data "external" "cidr_expander" {
+ id = (known after apply)
+ program = [
+ "python",
+ ".terraform/modules/C.common/scripts/myfuction.py",
]
+ query = {
+ "blacklisted_ips" = jsonencode([])
+ "cidr_blocks" = jsonencode([])
+ "offset" = "0"
+ "range_hop" = "1"
}
+ result = (known after apply)
}
Later on I removed the for_each dependency in A and discovered that it too wants to read data external during the apply phase:
# module.A.module.common.data.external.cidr_expander will be read during apply
# (config refers to values not yet known)
<= data "external" "cidr_expander" {
+ id = (known after apply)
+ program = [
+ "python",
+ ".terraform/modules/A.common/scripts/myfunction.py",
]
+ query = {
+ "blacklisted_ips" = jsonencode([])
+ "cidr_blocks" = jsonencode(
[
+ "10.93.81.88/30",
]
)
+ "offset" = "1"
+ "range_hop" = "0"
}
+ result = (known after apply)
}
There is no secret in myfunction.py. It just takes an array of CIDR ranges and breaks it down into an array of IP addresses in Python.
data "external" "cidr_expander" {
  program = ["python", "${path.module}/scripts/myfunction.py"]
  query = {
    cidr_blocks     = jsonencode(var.private_ip)
    blacklisted_ips = jsonencode(var.blacklisted_ip)
    range_hop       = var.ip_range_hop
    offset          = var.start_offset
  }
}
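For context, the external data source protocol is simple: Terraform passes the query as a JSON object on stdin and expects a JSON object whose values are all strings on stdout. Below is a simplified sketch of a script of that shape, not the real myfunction.py. The function names expand_cidrs, build_result and main are hypothetical, the comma-joined output format is an assumption, and range_hop and offset are ignored because their semantics are not described here.

```python
import ipaddress
import json
import sys


def expand_cidrs(cidr_blocks, blacklisted_ips):
    """Expand CIDR blocks (or bare IPs) into address strings, minus the blacklist."""
    blocked = set(blacklisted_ips)
    addresses = []
    for block in cidr_blocks:
        # A bare IP such as "10.93.81.111" is treated as a /32 network.
        network = ipaddress.ip_network(block, strict=False)
        for address in network:
            text = str(address)
            if text not in blocked:
                addresses.append(text)
    return addresses


def build_result(query):
    """Turn the Terraform query object into the result object.

    Assumption: range_hop and offset are ignored in this sketch.
    """
    cidr_blocks = json.loads(query["cidr_blocks"])
    blacklisted = json.loads(query["blacklisted_ips"])
    ips = expand_cidrs(cidr_blocks, blacklisted)
    # Every value in the result must be a string, so the list is joined.
    return {"filtered_iprange": ",".join(ips)}


def main():
    """Entry point when Terraform runs the script: JSON in on stdin, JSON out on stdout."""
    query = json.load(sys.stdin)
    json.dump(build_result(query), sys.stdout)

# When used as the external program, the script would end by calling main().
```

The point of the sketch is that the output depends only on the query values, so nothing in it requires information that exists only at apply time.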
All the arguments of the query are known in the plan phase itself. I wonder what is stopping Terraform from executing it during the plan phase.
Interestingly, while experimenting, I see that even a no-op is not run during the plan phase:
# module.A.module.common.data.external.noop will be read during apply
# (config refers to values not yet known)
<= data "external" "noop" {
+ id = (known after apply)
+ program = [
+ "python",
+ "-c",
+ "print('{}')",
]
+ result = (known after apply)
}
Its source is:
data "external" "noop" {
  program = ["python", "-c", "print('{}')"]
}