Is it possible to use variable data sources in a for_each?

I have a couple of remote state datasources based on AWS account names whose purpose it is to get the account ID.

I have something similar to this in a data_sources.tf:

data "terraform_remote_state" "account1" {
  backend = "s3"
  config = {
    bucket = blah
    ...etc
  }
}

data "terraform_remote_state" "account2" {
  ...etc
}

in my main.tf, I’m creating a list variable with the account names, and I’d like to do a for_each on the list of account names and use their associated data sources to retrieve the accountID on-the-fly.
I’ve tried something like below, but can’t get the syntax right and I’m not even sure it’s possible to do this, but I’m hopeful.

variable "accounts" {
  type = list
  default = [
    "account1",
    "account2"
  ]
}

module "test" {
  source = mysource
  for_each = toset(var.accounts)
  account_number = data.terraform_remote_state.${each.key}.outputs.account_id

Any help or suggestions are greatly appreciated. I’ve been working on this and googling for a couple hours now and I’m stuck.

Hi @cochrasc,

In order to work dynamically with resource instances like that you need to construct them systematically with for_each, which then makes Terraform consider them to be a map of objects identified by the for_each keys.

For example:

variable "accounts" {
  # set rather than list because these don't seem to be in
  # any significant order; we're just using them for instance
  # identifiers.
  type = set(string)
}

data "terraform_remote_state" "accounts" {
  for_each = var.accounts

  backend = "s3"
  config = {
    # Assuming a bucket naming scheme of "terraform-"
    # followed by the account key. Adapt as you need.
    bucket = "terraform-${each.key}"
  }
}

module "test" {
  source   = "./example"
  for_each = data.terraform_remote_state.accounts

  account_number = each.value.outputs.account_id
}

When multiple resources or modules are repeated on the same basis, it’s common to use the result of one as the for_each for another. This tells both human readers and Terraform itself that, in this case, there will be one instance of module "test" for each instance of data.terraform_remote_state.accounts, so adding new elements to var.accounts will properly thread through adding all of the necessary additional instances in the appropriate order. This works because a resource with for_each set appears in expressions as a map of objects, and a map of objects is itself a valid for_each argument.
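To make the "map of objects" point concrete, here is a minimal sketch of referring to those instances elsewhere in the configuration. It assumes "account1" is one of the elements of var.accounts and that each remote state exposes an account_id output, as in the original question; the output names account1_id and account_ids are only illustrative.

```hcl
# Look up a single instance by its for_each key.
# Assumes "account1" is an element of var.accounts.
output "account1_id" {
  value = data.terraform_remote_state.accounts["account1"].outputs.account_id
}

# Collect every account ID into a map keyed by account name.
output "account_ids" {
  value = {
    for name, state in data.terraform_remote_state.accounts :
    name => state.outputs.account_id
  }
}
```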

Thank you for the reply. Unfortunately, the structure of my state file locations would be difficult to do in the way you mentioned. In specifying the terraform_remote_state, I would basically need a map of the accounts and their terraform.tfstate locations since the directory structure can be very different depending on if the account is pci vs non-pci, dev/nonprod/prod, etc.
For example, the account “pci_eks_nonprod” has this as the backend key:
“my_org/pci_ou/nonprod_ou/accounts/pci_eks_nonprod/create_account/terraform.tfstate”
Whereas “non-pci_eks_prod” has a key of:
“my_org/non-pci_ou/prod_ou/accounts/pci_eks_prod/create_account/terraform.tfstate”
If I set up a map like this:
variable "account_map" {
  account1 = stateLocation1
  account2 = stateLocation2
}
How would I use this map to define the key in the for_each? I don’t think you can do this, which is the crux of my issue as posted.
key = "${var.account_map.${each.key}.value}"

I hope what I’m asking here makes sense.

If the variances between the different accounts are still in some sense systematic then you can use a map of objects instead of a set of strings to capture the essential differences between them, and then use those values to express the configuration differences.

For example:

variable "accounts" {
  type = map(object({
    bucket = string
    path   = string
    region = string
  }))
}

data "terraform_remote_state" "accounts" {
  for_each = var.accounts

  backend = "s3"
  config = {
    bucket = each.value.bucket
    key    = each.value.path
    region = each.value.region
  }
}
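As a rough sketch of how a caller might populate that variable, a terraform.tfvars file could look something like the following. The state paths are the ones quoted earlier in this thread; the bucket name and region are placeholders I've made up, so substitute your own.

```hcl
# terraform.tfvars
# Bucket and region are placeholder values; only the paths come from this thread.
accounts = {
  pci_eks_nonprod = {
    bucket = "example1-bucket"
    path   = "my_org/pci_ou/nonprod_ou/accounts/pci_eks_nonprod/create_account/terraform.tfstate"
    region = "us-east-1"
  }
  "non-pci_eks_prod" = {
    bucket = "example1-bucket"
    path   = "my_org/non-pci_ou/prod_ou/accounts/pci_eks_prod/create_account/terraform.tfstate"
    region = "us-east-1"
  }
}
```

With this in place, the module "test" block from my earlier example should work unchanged, because it only depends on data.terraform_remote_state.accounts being a map of objects.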

Alternatively, if you’d rather keep the specifics of each of the accounts encapsulated inside your module and retain the external API of having the caller just choose a subset of those by key, you can add a little more indirection but get the same result like this:

variable "accounts" {
  type = set(string)
}

locals {
  account_backend_configs = {
    pci_eks_nonprod = {
      bucket = "example1-bucket"
      path   = "my_org/pci_ou/nonprod_ou/accounts/pci_eks_nonprod/create_account/terraform.tfstate"
      region = "us-east-1"
    }
    non-pci_eks_prod = {
      bucket = "example1-bucket"
      path   = "my_org/non-pci_ou/prod_ou/accounts/pci_eks_prod/create_account/terraform.tfstate"
      region = "us-east-1"
    }
  }
}

data "terraform_remote_state" "accounts" {
  for_each = {
    for key in var.accounts : key => local.account_backend_configs[key]
  }

  backend = "s3"
  config = {
    bucket = each.value.bucket
    key    = each.value.path
    region = each.value.region
  }
}

One final option would be to manually construct a mapping like the one Terraform would create with for_each, which allows you to choose arbitrarily which resources belong to the map vs. the systematic approach that for_each requires:

locals {
  account_outputs = {
    account1 = data.terraform_remote_state.account1
    account2 = data.terraform_remote_state.account2
  }
}

Then you can refer to local.account_outputs[each.key] to look the accounts up by key, even though they were not constructed systematically.
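For instance, a sketch of the module block from the original question rewritten to use that local value might look like this. It assumes var.accounts is still the list of account names from the question, that every name in it has a matching key in local.account_outputs, and that each remote state exposes an account_id output.

```hcl
module "test" {
  source   = "./example"
  for_each = toset(var.accounts)

  # each.key is the account name, e.g. "account1"; the lookup fails at plan
  # time if that name is missing from local.account_outputs.
  account_number = local.account_outputs[each.key].outputs.account_id
}
```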