"Invalid for_each argument" with data source result

I cannot get this to work with for_each blocks. I’m getting:

Error: Invalid for_each argument

  on ../modules/eksctl-config/data.tf line 30, in data "aws_subnet" "private_subnets":
  30:   for_each = data.aws_subnet_ids.private_subnet_ids.ids

The "for_each" value depends on resource attributes that cannot be determined
until apply, so Terraform cannot predict how many instances will be created.
To work around this, use the -target argument to first apply only the
resources that the for_each depends on.

I tried adding a dependency on the variable:

data "aws_subnet" "private_subnets" {
  for_each   = data.aws_subnet_ids.private_subnet_ids.ids
  id         = each.value
  depends_on = [var.vpc_id]
}

But I still get the errors.

Hi @darkn3rd,

The requirement for for_each and count is that their values must always be known at plan time. depends_on doesn’t affect that, because Terraform still needs to show in the plan how many instances of data.aws_subnet.private_subnets there will be during apply.

To make this work you’ll need to rework your design so that the data "aws_subnet_ids" "private_subnet_ids" configuration only uses values that are known during planning, which will then allow that data resource to be handled during planning rather than deferred to the apply step.
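As a rough sketch of what "known during planning" means here, suppose the VPC already exists and its ID is passed in as a plain value. The variable name existing_vpc_id is hypothetical, and the aws_subnet_ids arguments are an assumption, since that block was not shared in this thread:

```hcl
# Hypothetical variable: the ID of a VPC that already exists, supplied as
# a literal value (for example from a .tfvars file), so it is known at plan time.
variable "existing_vpc_id" {
  type = string
}

# Because vpc_id is known during planning, Terraform can read this data
# source during the plan phase instead of deferring it to apply.
data "aws_subnet_ids" "private_subnet_ids" {
  vpc_id = var.existing_vpc_id
}

# The set of IDs is therefore known at plan time, and for_each is valid.
data "aws_subnet" "private_subnets" {
  for_each = data.aws_subnet_ids.private_subnet_ids.ids
  id       = each.value
}
```

If vpc_id instead came from an aws_vpc resource created in the same run, the ID would be unknown until apply, the aws_subnet_ids read would be deferred, and the for_each would fail with the error shown above.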

Unfortunately I can’t give more details on how to achieve that based on what you’ve shared so far. It seems like this configuration is both managing a VPC and reading its subnets, which is an unusual combination: if this configuration is responsible for creating the VPC then I’d expect it to be responsible for creating the subnets too, or else it would always find that there are no subnets in the VPC immediately after it was created.

If you can share a little more about how your configuration is put together then I might be able to make some more specific suggestions.

I have a module that creates a config script based on passing a vpc_id. If I try to get that vpc_id from a vpc that I will create, I start getting errors like this:

Error: Invalid for_each argument

  on ../modules/eksctl-config/data.tf line 30, in data "aws_subnet" "private_subnets":
  30:   for_each = data.aws_subnet_ids.private_subnet_ids.ids

The "for_each" value depends on resource attributes that cannot be determined
until apply, so Terraform cannot predict how many instances will be created.
To work around this, use the -target argument to first apply only the
resources that the for_each depends on.

I’m not sure how to handle the situation. My module inputs look like this:

module "vpc" {
  source           = "../modules/eks-vpc"
  name             = var.name
  region           = var.region
  eks_cluster_name = var.eks_cluster_name
}

module "config" {
  source          = "../modules/eksctl-config"
  name            = var.eks_cluster_name
  region          = var.region
  vpc_id          = module.vpc.vpc_id
  instance_type   = "m5.2xlarge"
  public_key_name = "joaquin"
  filename        = "${path.module}/../../eksctl/cluster_config.yaml"
}

Hi again, @darkn3rd! I found this other topic you created about the same problem that had some additional information in it, so I’ve merged the two here.

This does give me a little more information to work with, and I see now that you are probably having the config module accept the VPC ID and then look up the subnets itself, and so that’s why you’re seeing this problem: it can’t look up the subnets until the VPC is created, and so the for_each in the module is invalid.

This is a good case for Module Composition, and you’re already taking a dependency-inversion-style approach by having the config module receive the VPC ID rather than managing the VPC itself. To avoid the problem you saw, I would recommend taking the dependency inversion pattern even further: have the config module take the subnets as arguments too, rather than looking them up itself, and have the vpc module export that information.

To make that more concrete, I’m going to show one way to do it. There are other variants, so I’d encourage you to experiment and see what feels like the best trade-off, but here’s one example that should work…

First, we need to think about what the outputs from the vpc module are. Clearly you already have a vpc_id output value declared, but I can’t tell if you already have it exporting the subnet ids. I’d write the outputs something like the following, using object values to group together all of the related information:

output "vpc" {
  value = {
    id         = aws_vpc.example.id
    cidr_block = aws_vpc.example.cidr_block
  }
}

output "private_subnets" {
  value = {
    # I'm assuming here that availability_zone is
    # a suitable unique identifier for your subnets.
    # If not, select some other expression but be
    # sure it only derives from values that can
    # be known at plan time.
    for s in aws_subnet.example : s.availability_zone => {
      id                = s.id
      vpc_id            = s.vpc_id
      availability_zone = s.availability_zone
      cidr_block        = s.cidr_block
    }
  }
}
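For the map keys in that output to be known at plan time, the subnets themselves need to be declared with keys that come from configuration rather than from the remote API. A minimal sketch of how the vpc module might do that, assuming resources named aws_vpc.example and aws_subnet.example as in the outputs above; the variable name, zones, and CIDR ranges are made up for illustration:

```hcl
# Hypothetical input: availability zone => CIDR block for each private subnet.
variable "private_subnet_cidrs" {
  type = map(string)
  default = {
    "us-west-2a" = "10.0.1.0/24"
    "us-west-2b" = "10.0.2.0/24"
  }
}

resource "aws_vpc" "example" {
  cidr_block = "10.0.0.0/16"
}

resource "aws_subnet" "example" {
  # The keys are written in configuration, so Terraform knows during
  # planning how many subnets there will be and what to call them.
  for_each = var.private_subnet_cidrs

  vpc_id            = aws_vpc.example.id # unknown until apply, which is fine
  availability_zone = each.key
  cidr_block        = each.value
}
```

With this shape, s.availability_zone in the output expression is known during planning even though s.id is not, which is what makes the resulting map safe to use with for_each later.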

The value representing a module call has an object type whose attributes match the declared output values, so with the above outputs we know that module.vpc will at least have vpc and private_subnets attributes. We can then add an input variable to the config module whose type constraint includes the VPC-related information the module needs in a shape that matches the module’s object type:

variable "network" {
  type = object({
    vpc = object({
      id = string
    })
    private_subnets = map(object({
      id = string
    }))
  })
}

Notice that the above type constraint only includes a subset of the attributes in the module’s actual output values. That’s okay because Terraform’s type system allows automatic type conversion of any object type that has at least the attributes given in the type constraint.

Your config module can then use var.network.vpc.id to access the VPC ID and var.network.private_subnets to get information about all of the subnets. By making sure that the keys of the private_subnets map are values known at plan time, it is safe to use var.network.private_subnets as a for_each expression elsewhere in that module.
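Inside the config module, that might look something like the following sketch. It assumes the variable "network" declaration above and the module's existing name input (the cluster name); the aws_ec2_tag resource is only an illustrative use and requires an AWS provider version that includes it:

```hcl
locals {
  # A plain list of subnet IDs, e.g. for rendering into the config file.
  # The IDs themselves may be unknown until apply; that is fine because
  # this value is not used as a for_each expression.
  private_subnet_ids = [for s in var.network.private_subnets : s.id]
}

# Hypothetical example of for_each over the map: one tag per private subnet.
# The map keys are known at plan time, so Terraform can plan one instance
# per subnet even though each.value.id is not yet known.
resource "aws_ec2_tag" "cluster_subnet" {
  for_each = var.network.private_subnets

  resource_id = each.value.id
  key         = "kubernetes.io/cluster/${var.name}"
  value       = "shared"
}
```

This replaces the data "aws_subnet" lookup entirely: any other subnet attribute the module needs can be added to the object type in variable "network" and read from each.value in the same way.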

Finally, you can tie those two modules together by assigning the VPC module object itself to the network argument of the config module, which Terraform should accept because the type constraint will match it:

module "vpc" {
  source           = "../modules/eks-vpc"
  name             = var.name
  region           = var.region
  eks_cluster_name = var.eks_cluster_name
}

module "config" {
  source          = "../modules/eksctl-config"
  name            = var.eks_cluster_name
  region          = var.region
  network         = module.vpc
  instance_type   = "m5.2xlarge"
  public_key_name = "joaquin"
  filename        = "${path.module}/../../eksctl/cluster_config.yaml"
}

The overall idea here is for the objects created in the vpc module to flow directly into the config module, rather than having the config module go and fetch them itself. That makes the data flow between the modules clearer, and in turn gives a better representation of the dependencies between the objects in the vpc module and the objects in the config module.

If you have other modules that also depend on the network information, you can write them with similar variable "network" blocks that each declare a type constraint that’s a different subset of the module.vpc type, each reflecting the parts of that data structure it depends on, and thus you can establish a convention of passing module.vpc into each of the modules that interacts with that VPC.
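For instance, a second module that only cares about the VPC itself could declare a narrower constraint. The module name and path here are hypothetical:

```hcl
# In a hypothetical ../modules/bastion module: only the VPC attributes
# this module actually uses are listed in the type constraint.
variable "network" {
  type = object({
    vpc = object({
      id         = string
      cidr_block = string
    })
  })
}
```

```hcl
# In the root module: the same module.vpc object is passed along, and
# Terraform discards the attributes this constraint does not mention.
module "bastion" {
  source  = "../modules/bastion"
  network = module.vpc
}
```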
