Using for and for_each to declare multiple resources

I am trying to create a module that creates an EC2 instance with a variable number of attached volumes. The volumes need to be declared as separate resources and attached to the instance via the aws_volume_attachment resource, because this is for a fileserver whose volumes must persist when I run terraform apply with a new AMI id. That is: I deploy the server, use it to write some data to the volumes, then create a new AMI and run terraform apply. Terraform then destroys the attachment resources, retains the volumes because they haven’t changed, destroys the instance, launches a new instance, and creates new attachment resources which attach the old volumes to the new instance.

I need help with the correct syntax and method for declaring the various resources and their links using for_each.

Example:
I have a .tf file which will contain 3 module blocks. They all point at the same subfolder, ec2_instance, and each one passes parameters in, including a “vols” parameter containing the volume definitions. Each invocation of the module has a different number of volumes:

Module 1 might use:

vols = {
    data01 = {
        device_name = "xvde"
        size        = "100"
        type        = "gp2"
    }
    data02 = {
        device_name = "xvdh"
        size        = "200"
        type        = "gp2"
    }
}

and Module 2 might use:

vols = {
    output01 = {
        device_name = "xvde"
        size        = "800"
        type        = "gp2"
    }
}
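
For illustration, the two invocations above might sit in the root .tf file roughly as follows. This is only a sketch: the module labels storage_a and storage_b and the relative source path are hypothetical, and all other module parameters are left out.

```hcl
# Hypothetical module labels; both calls point at the same ec2_instance subfolder.
module "storage_a" {
  source = "./ec2_instance"

  vols = {
    data01 = { device_name = "xvde", size = "100", type = "gp2" }
    data02 = { device_name = "xvdh", size = "200", type = "gp2" }
  }
}

module "storage_b" {
  source = "./ec2_instance"

  vols = {
    output01 = { device_name = "xvde", size = "800", type = "gp2" }
  }
}
```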

Currently, this is my file which declares 1 instance, 2 volumes and 2 volume_attachments. It will create a single instance with 2 volumes attached (and will be the basis of the inside of my new module):

resource "aws_instance" "storage" {
    ami                     = data.aws_ami.storage.id
    ...
    user_data               = data.template_file.userdata_executescript.rendered
    root_block_device {
        volume_type           = "gp2"
        volume_size           = 80
        delete_on_termination = true
    }

    tags = merge(
        var.common_tags,
        {
        "Name"   = format("%s_storage", var.instance_name)
        "Role"   = format("%s_storage", var.stack_name)
        "Backup" = "true"
        },
    )
}

resource "aws_ebs_volume" "volume_input" {
    availability_zone = local.availability_zone
    size              = 100
    type              = "gp2"
    encrypted         = true

    tags = merge(
        var.common_tags,
        {
        "Name"   = format("%s_storage-%s", var.stack_name, "data01")
        "Backup" = "true"
        },
    )
}

resource "aws_ebs_volume" "volume_input__backup" {
    availability_zone = local.availability_zone
    size              = 200
    type              = "gp2"
    encrypted         = true
    tags = merge(
        var.common_tags,
        {
        "Name"   = format("%s_storage-%s", var.stack_name, "data02")
        "Backup" = "true"
        },
    )
}

resource "aws_volume_attachment" "volume_input" {
    device_name = "xvde"
    volume_id   = aws_ebs_volume.volume_input.id
    instance_id = aws_instance.storage.id
}

resource "aws_volume_attachment" "volume_input__backup" {
    device_name = "xvdh"
    volume_id   = aws_ebs_volume.volume_input__backup.id
    instance_id = aws_instance.storage.id
}

As there will be a differing number of volumes and volume attachments each time the module is invoked, I thought I’d do this using a for_each to declare the volume and volume_attachment resources (at the top of each resource I’ve put the common values, then the ‘each’ values beneath where they differ per resource):

resource "aws_ebs_volume" "volume" {
    for_each = var.vols

    availability_zone = local.availability_zone
    encrypted         = true

    size = each.value.size
    type = each.value.type

    tags = merge(
        var.common_tags,
        {
        "Name"   = format("%s_storage-%s", var.stack_name, each.key)
        "Backup" = "true"
        },
    )
}

resource "aws_volume_attachment" "volume" {
    for_each = var.vols

    instance_id = aws_instance.storage.id

    device_name = each.value.device_name
    volume_id   = aws_ebs_volume.volume["${lookup(var.vols[each.key], each.key, "")}"].id
}

What I need help with is understanding how to specify the volume_id in the aws_volume_attachment resource. I’ve made an attempt at it so you can see the type of value I am trying to lookup/get.

Once I’ve got this set of declarations working, I also need help understanding how to output the volume details into the userdata. I’m using EC2Launch v2 for Windows, which takes yaml input. Essentially, I need to create a userdata file which contains something like this, but with a different number of devices each time:

version: 1.0
tasks:
  - task: initializeVolume
    inputs:
      initialize: devices
      devices:
        - device: xvde
          name: data01
          letter: D
          partition: gpt
        - device: xvdh
          name: data02
          letter: E
          partition: gpt

I’m new to the for/for_each concepts in terraform and can’t seem to figure out the correct syntaxes and formats for repeating elements. Help is appreciated.

For the second part of my query, I’d conceptually like to do something like the following:

variable "vols" {
    type = map(map(string))
    default = {
        "data01" = {
            device_name    = "xvde"
            size           = "100"
            type           = "gp2"
            drive_letter   = "D"
            partition_type = "gpt"
        }
        "data02" = {
            device_name    = "xvdh"
            size           = "200"
            type           = "gp2"
            drive_letter   = "E"
            partition_type = "gpt"
        }
    }
}

data "template_file" "userdata_initializevolume" {
    for_each = var.vols

    template = file(format("%s/UserData_initializeVolume.tpl", path.module))
    vars = {
        device_id      = each.value.device_name
        volume_label   = each.key
        drive_letter   = each.value.drive_letter
        partition_type = each.value.partition_type
    }
}

resource "aws_instance" "storage" {
    ...
    user_data = format("%s%s%s",
        file(format("%s/UserData_header.yaml", path.module)),
        flatten(data.template_file.userdata_initializevolume[*].rendered),
        data.template_file.userdata_executescript.rendered
    )
    ...
}

where UserData_initializeVolume.tpl is a template file containing:

    - device: ${device_id}
      name: ${volume_label}
      letter: ${drive_letter}
      partition: ${partition_type}

and then I just flatten/concatenate all the instances of that template together into one.

For the volume_attachment, the answer turned out to be simply:

volume_id   = aws_ebs_volume.volume[each.key].id
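
Put back into the full resource, that line would look roughly like this (a sketch reusing the resource names from the declarations above). Because both resources use for_each = var.vols, they share the same set of keys, so each.key indexes straight into aws_ebs_volume.volume:

```hcl
resource "aws_volume_attachment" "volume" {
  for_each = var.vols

  instance_id = aws_instance.storage.id

  device_name = each.value.device_name
  # Same key set as aws_ebs_volume.volume, so each.key selects the matching volume.
  volume_id = aws_ebs_volume.volume[each.key].id
}
```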

The final part still unanswered is: how do I get the values from the rendered template_file all into one concatenated string?

The templates are correctly created, I just need to get the values of all of the “rendered” properties into one string. My statefile looks like this:

{
  "module": "module.storage.module.primary",
  "mode": "data",
  "type": "template_file",
  "name": "userdata_initializevolume_test",
  "each": "map",
  "provider": "provider.template",
  "instances": [
    {
      "index_key": "data01",
      "schema_version": 0,
      "attributes": {
        "filename": null,
        "id": "3be663f974c41",
        "rendered": "        - device: xvde\n          name: data01\n          letter: D\n          partition: gpt\n",
        "template": "        - device: ${device_id}\n          name: ${volume_label}\n          letter: ${drive_letter}\n          partition: ${partition_type}\n",
        "vars": {
          "device_id": "xvde",
          "drive_letter": "D",
          "partition_type": "gpt",
          "volume_label": "data01"
        }
      }
    },
    {
      "index_key": "data02",
      "schema_version": 0,
      "attributes": {
        "filename": null,
        "id": "867c9186f76f",
        "rendered": "        - device: xvdh\n          name: data02\n          letter: E\n          partition: gpt\n",
        "template": "        - device: ${device_id}\n          name: ${volume_label}\n          letter: ${drive_letter}\n          partition: ${partition_type}\n",
        "vars": {
          "device_id": "xvdh",
          "drive_letter": "E",
          "partition_type": "gpt",
          "volume_label": "data02"
        }
      }
    }
  ]
},

and what I need to feed it into is something like this:

user_data = format("%s%s%s",
    file(format("%s/UserData_header.yaml", path.module)),
    <rendered strings for all templates go here>,
    data.template_file.userdata_executescript.rendered
)
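
One way this could be done while staying with the template_file data source: a data source that uses for_each is a map of objects keyed by the for_each keys, so a for expression can collect each rendered attribute into a list and join can concatenate the result. This is a sketch only; the local value name is hypothetical.

```hcl
locals {
  # Map elements are visited in lexical key order, so data01 precedes data02.
  initialize_volume_yaml = join("", [
    for tmpl in data.template_file.userdata_initializevolume : tmpl.rendered
  ])
}
```

local.initialize_volume_yaml could then be passed as the middle argument of the format call.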

Hi @rba1-source!

I’m glad you found a working answer for your references from volume attachments to volumes. Another approach that might’ve worked here, which I’m sharing only in case it’s useful for future work, is to use the other resource value as the for_each:

resource "aws_volume_attachment" "volume" {
  for_each = aws_ebs_volume.volume

  # ...
}

That approach means that each.value inside this block would refer to an EBS volume object rather than an object from your input variable. Whether that’s appropriate depends on whether you can get all of the information you need from the other resource’s result, which doesn’t seem to be true in this case: there isn’t an equivalent to each.value.device_name if each.value were the EBS volume. So what you did here, of cross-referencing into the other resource by key, is a fine approach that gives you access to both sets of values.
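
As a sketch of how that alternative could still be combined with the input variable in this case: the attachment iterates over the volume resource, takes the volume ID from each.value, and looks up the device name in var.vols under the same key.

```hcl
resource "aws_volume_attachment" "volume" {
  # Chain from the volume resource: each.value is an aws_ebs_volume object.
  for_each = aws_ebs_volume.volume

  instance_id = aws_instance.storage.id
  volume_id   = each.value.id

  # The volume object carries no device name, so read it from the input map.
  device_name = var.vols[each.key].device_name
}
```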


Moving on to the user_data part of your question, I first want to note that because you are using Terraform 0.12 it’s better to use the templatefile function rather than the template_file data source. The data source continues to exist primarily for backward compatibility with configurations that were originally written for Terraform 0.11 and earlier, as noted in its documentation.

A key advantage of the templatefile function is that, because it’s built in to the Terraform language rather than a separate provider, it can accept values of any type the Terraform language supports, including your map of objects in var.vols. That means you can generate the whole user_data result in a single call to templatefile if we pass that map of objects to it:

  user_data = templatefile("${path.module}/userdata.yaml.tmpl", {
    vols = var.vols
  })

Since you are generating YAML here, we can follow the advice in the templatefile documentation by calling either the yamlencode or jsonencode functions inside the template, rather than trying to construct valid YAML via string concatenation:

${jsonencode({
  version = "1.0"
  tasks = [
    {
      task = "initializeVolume"
      inputs = {
        initialize = "devices"
        devices = [
          for k, v in vols : {
            device    = v.device_name
            name      = k
            letter    = v.drive_letter
            partition = v.partition_type
          }
        ]
      }
    },
  ]
})}

The above uses a for expression to transform the map of objects given as vols into a list of objects that will produce the data structure you described once encoded as JSON or YAML.

I used jsonencode rather than yamlencode here because YAML is a superset of JSON and so a valid YAML parser should always be able to parse JSON too, and Terraform’s JSON serializer is committed not to change its results in future releases while the YAML serializer is marked as experimental. However, you could use yamlencode instead if using the YAML indent-based syntax is important to you, at the expense of the exact format potentially changing in a future version of Terraform and thus causing your instances to be needlessly replaced.
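
For comparison, the yamlencode variant of the same template would differ only in the function name. A sketch, assuming the same vols template variable as above:

```hcl
${yamlencode({
  version = "1.0"
  tasks = [
    {
      task = "initializeVolume"
      inputs = {
        initialize = "devices"
        devices = [
          for k, v in vols : {
            device    = v.device_name
            name      = k
            letter    = v.drive_letter
            partition = v.partition_type
          }
        ]
      }
    },
  ]
})}
```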

Hi @apparentlymart!

Thanks so much for your response. Your suggestion of using “for_each = aws_ebs_volume.volume” is really clever, because it means I’m guaranteed to create the attachments based only on which volumes were created. Very neat! Shame I can’t use it in this case because the resource doesn’t have the output properties that I need, but I have a feeling I can definitely reuse this idea elsewhere in my module, so that was really useful.

I’ve tried using templatefile instead (yes, the problem of updating from an older terraform version is that you don’t always see all the new functions). It’s given me valid json and yaml from your example, which means I’m almost there. The last bit I can’t get is adding my powershell script payload into the templatefile. I can pass my extra variables in just fine:

user_data = templatefile("${path.module}/userdata.yaml.tmpl", {
    vols          = var.vols
    domain        = var.domain
    admin_group   = var.admin_group
    ...
})

but I can’t seem to get my powershell template into the template in the same way as the templatefile example (I get an “Invalid character; This character is not used within the language.” error when it hits the dollar symbol in front of “$logfile”):

${jsonencode({
version = "1.0"
tasks = [
    {
        task = "initializeVolume"
        inputs = {
            initialize = "devices"
            devices = [
                for k, v in vols : {
                    device    = v.device_id
                    name      = k
                    letter    = v.drive_letter
                    partition = v.partition_type
                }
            ]
        }
    },
    {
        task = "executeScript"
        inputs = {
            frequency = "always"
            type = "powershell"
            runAs = "localSystem"
            content = {
                $logfile = "C:\Windows\Temp\userdata.log"
                "Transcript started {0}" -f (Get-Date).DateTime | Out-File $logfile -Append
                "Joining domain {0} with admingroup {1}" -f ${domain},${admin_group} | Out-File $logfile -Append
                ...

If this were yaml, the “content” line would be defined as “content: |-” and then I would just paste my whole powershell into the lines following it.

Essentially, I need to know how to inject the values from the map into the powershell section of the templatefile. Do I need to escape any characters in my powershell, and if so, how do I do that so that the map values are substituted while variables internal to the powershell are left alone (i.e. “${domain}” should have the value passed in from the map, and “$logfile” should be treated as an internal powershell variable)? I couldn’t find these questions answered in the jsonencode/templatefile docs.

Thanks.

I’ve spent most of today trying out different jsonencode, yamlencode and template combinations.

The only way I can get it to work with AWS is if I have 2 templatefiles: 1 for the yaml header and the volume definitions, and 1 for the powershell. If I use the jsonencode/yamlencode functions, it does generate valid yaml/json, but the AWS yaml parser doesn’t like it and fails. So I will have to use something more basic, just using templatefiles.

I tried the following as file userdata_initializevolume.yaml.tmpl:

version: 1.0
tasks:
  - task: initializeVolume
    inputs:
      initialize: devices
      devices:
      %{ for k, v in vols ~}
        - device: v.device_id
          name: k
          letter: v.drive_letter
          partition: v.partition_type
      %{ endfor ~}

and then used this in my .tf file:

user_data = format("%s%s",
  templatefile("${path.module}/userdata_initializevolume.yaml.tmpl", {
     vols = var.vols
  }),
  templatefile("${path.module}/UserData_executeScript.tpl", {
    domain = var.domain
    ..
  })
)

This parses UserData_executeScript.tpl just fine, putting the values in from the map and concatenating both templatefiles into one userdata string that includes all my powershell. (Using a template like the above means I can keep my powershell intact, like a heredoc, and not need to format it or escape it in any way, which answers my previous question.) However, for the output from the userdata_initializevolume section it gave me:

version: 1.0
tasks:
  - task: initializeVolume
    inputs:
      initialize: devices
      devices:
              - device: v.device_id
          name: k
          letter: v.drive_letter
          partition: v.partition_type
              - device: v.device_id
          name: k
          letter: v.drive_letter
          partition: v.partition_type

which isn’t correct. How do I use the for loop successfully in the template if I need to keep the text with the specified indentation?

Thanks.

Got it:

version: 1.0
tasks:
  - task: initializeVolume
    inputs:
      initialize: devices
      devices:
%{ for k, v in vols ~}
        - device: ${v.device_id}
          name: ${k}
          letter: ${v.drive_letter}
          partition: ${v.partition_type}
%{ endfor ~}

Removing the spaces in front of each “%{” put the following line at the correct indentation, and writing the values as ${...} interpolations rather than bare names generated the correct output:

version: 1.0
tasks:
  - task: initializeVolume
    inputs:
      initialize: devices
      devices:
        - device: xvde
          name: data01
          letter: D
          partition: gpt
        - device: xvdh
          name: data02
          letter: E
          partition: gpt
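
An alternative that sidesteps directive whitespace altogether, offered only as a sketch, is to build the device entries in the Terraform configuration with format and join, where the indentation is spelled out explicitly in the format string. The local value names are hypothetical, and the attribute names follow the template above.

```hcl
locals {
  device_entries = [
    for name, v in var.vols : format(
      "        - device: %s\n          name: %s\n          letter: %s\n          partition: %s\n",
      v.device_id, name, v.drive_letter, v.partition_type
    )
  ]

  # One string containing every device entry, ready to append after "devices:".
  devices_yaml = join("", local.device_entries)
}
```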

Thanks for all the help!

Hi @rba1-source,

If I’m understanding correctly, you’re saying that for this content property you would typically use one of the YAML multi-line string syntaxes to declare the script as a single string in your YAML.

If that’s a correct understanding then the Terraform language equivalent of the multi-line string syntax is the “heredoc” string syntax described in String Literals, which is inspired by Unix shell multi-line string syntax. Taking the example you showed I think this would be the way to write it in Terraform:

{
  content = <<-END
    $logfile = "C:\Windows\Temp\userdata.log"
    "Transcript started {0}" -f (Get-Date).DateTime | Out-File $logfile -Append
    "Joining domain {0} with admingroup {1}" -f ${domain},${admin_group} | Out-File $logfile -Append
    ...
  END
}

The “flush heredoc” syntax, with the - in <<-END, is largely equivalent to YAML’s |+ string introducer, in that it will trim off whatever leading literal whitespace all of the lines have in common and will keep all newlines including ones at the end of the sequence. The END in the introducer matches with the END marker to delimit the nested template.
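
On the escaping question: in both heredocs and template files, only the sequences ${ and %{ are special, so a plain PowerShell variable such as $logfile passes through untouched. A PowerShell expression that itself uses braces, such as ${env:COMPUTERNAME}, can be written with a doubled dollar sign, which Terraform emits as a literal ${. A minimal sketch, where the domain value is a made-up placeholder:

```hcl
locals {
  domain = "corp.example.com" # hypothetical value for illustration

  boot_script = <<-END
    $logfile = "C:\Windows\Temp\userdata.log"
    "Joining domain ${local.domain}" | Out-File $logfile -Append
    "Running on $${env:COMPUTERNAME}" | Out-File $logfile -Append
  END
}
```

Here ${local.domain} is replaced by Terraform, while $logfile and ${env:COMPUTERNAME} reach PowerShell unchanged.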

If the PowerShell script is long and it only needs to appear once regardless of the number of volumes, I might suggest putting that part in its own file and rendering this in two passes, so that the PowerShell template can be read and maintained separately from the YAML data structure it’s embedded in. For example:

user_data = templatefile("${path.module}/userdata.yaml.tmpl", {
    vols        = var.vols
    boot_script = templatefile("${path.module}/boot.ps1.tmpl", {
      domain        = var.domain
      admin_group   = var.admin_group
      ...
    })
})

If you use this multi-template technique, your content attribute in the user data template would just be a direct reference to the already-rendered script in boot_script:

{
  content = boot_script
}
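
Putting the pieces together, userdata.yaml.tmpl might then look something like the following. This is a sketch that combines the earlier initializeVolume structure with the executeScript task as written in your example; it assumes each entry in vols has device_name, drive_letter and partition_type attributes, and the exact shape of the executeScript inputs should be checked against the EC2Launch v2 documentation.

```hcl
${jsonencode({
  version = "1.0"
  tasks = [
    {
      task = "initializeVolume"
      inputs = {
        initialize = "devices"
        devices = [
          for k, v in vols : {
            device    = v.device_name
            name      = k
            letter    = v.drive_letter
            partition = v.partition_type
          }
        ]
      }
    },
    {
      task = "executeScript"
      # Shape copied from the example in the question, not verified against EC2Launch v2.
      inputs = {
        frequency = "always"
        type      = "powershell"
        runAs     = "localSystem"
        content   = boot_script
      }
    },
  ]
})}
```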

Hi @apparentlymart,

The way I solved it in the end was simply to not use jsonencode/yamlencode, and use straight templatefile text files instead, where it just replaced the string variables from my map. This allowed me to keep the powershell code unescaped in its original form.

All my original questions have been answered now, and I have this working.

Thanks for all the help!