AWS User Data with Multiple Files using Templatefile

The templatefile function is meant to replace the template_file data source, but doesn’t seem to give you a way of processing multiple files at once and I am not sure how to get around this. I can do something similar with the older method in combination with the template_cloudinit_config data source, which will take multiple input files, and stitch them together. How do I accomplish something similar with templatefile?

Here is an example of what I am trying to accomplish.

resource "aws_instance" "server" {
  for_each = var.servers

  ami = each.value.ami_id
  instance_type = each.value.instance_type
  availability_zone = each.value.availability_zone
  subnet_id = each.value.subnet_id
  vpc_security_group_ids = each.value.security_group_ids
  key_name = var.key_name

  # Note: templatefiles() is not a real function; it only illustrates the behavior I am after.
  user_data = templatefiles(
    [
      "${path.root}/scripts/common.sh",
      "${path.module}/scripts/configure.sh"
    ],
    {
      hostname = "${each.value.environment}-server-${each.value.index}.${each.value.domain}",
      setting_1 = each.value.setting_1,
      setting_2 = each.value.setting_2,
      ...
    }
  )

  tags = {
    Name = "${each.value.environment}-server-${each.value.index}"
  }
}

In this case, I only know the hostname when the EC2 instance is created, hence the use of the templatefile function over the template_cloudinit_config data source. However, templatefile only takes a single file, so I am not sure how I am supposed to combine the ability to merge multiple files into a single payload, which template_cloudinit_config provides, with the ability to interpolate values at creation time, which templatefile provides. It is pretty common to have multiple configuration scripts that you want to merge and pass to cloud-init via user_data, so I suspect this issue has come up already, but I can’t seem to find a solution anywhere.

Any ideas?

Hi @romasi,

The cloudinit_config data source would be the most robust way to combine together multiple results, and I think it should be possible to make it work; if you can share what you originally tried I might be able to advise on how to connect it with your multiple EC2 instances.

It sounds like you previously had an answer to this with the template_file data source. I’m not sure I fully understand what you want to achieve here but if you do have an example with that data source and could share it then I’m happy to try to translate what you have into an equivalent expression using templatefile function calls.

Thanks @apparentlymart. In short, I am trying to pass multiple setup scripts to the EC2 instance and interpolate variables into them at creation time. Per the documentation, I should be using templatefile instead of template_file. Neither of these supports merging files, so I have to use cloudinit_config (or template_cloudinit_config) along with template_file to accomplish this. I am trying to find a way to do this using templatefile instead of template_file, since I only know some variables at EC2 creation time.

I was able to make this work by creating a merge_files module (below) and just calling that, but it is a bit hacky.

// modules/helper/merge_files/main.tf

variable "files" {
  type = list(string)
}

locals {
  files = {
    for index, file in var.files: index => file
  }
}

data "local_file" "files" {
  for_each = local.files
  filename = each.value
}

data "template_cloudinit_config" "merged" {
  gzip          = false
  base64_encode = false

  dynamic "part" {
    for_each = data.local_file.files

    content {
      filename     = "${part.key}_${basename(part.value.filename)}"
      content_type = "text/x-shellscript"
      content      = part.value.content
    }
  }
}

resource "local_file" "merged" {
  content = data.template_cloudinit_config.merged.rendered
  filename = "${path.root}/.terraform/tmp/merged_${timestamp()}.txt"
}

output "filename" {
  value = local_file.merged.filename
}

// main.tf

module "merged_file" {
  source = "./modules/helper/merge_files"
  files = [
    "${path.root}/scripts/common.sh",
    "${path.module}/scripts/configure.sh"
  ]
}

resource "aws_instance" "server" {
  for_each = var.servers

  ami = each.value.ami_id
  instance_type = each.value.instance_type
  availability_zone = each.value.availability_zone
  subnet_id = each.value.subnet_id
  vpc_security_group_ids = each.value.security_group_ids
  key_name = var.key_name

  user_data = templatefile(module.merged_file.filename, { 
    hostname = "${each.value.environment}-server-${each.value.index}.${each.value.domain}"
  })

  tags = {
    Name = "${each.value.environment}-server-${each.value.index}"
  }
}

Per your reply here, I see that you recommend not using user_data to pass in setup scripts, but rather using Packer to build them into the base image so that user_data only has to call them. I do like this idea, and I currently use Packer for the images, but it can be quite inefficient to rebuild them for every setup script change, especially if there are multiple layers of image dependencies (e.g. a base-image which is used for service images, etc.). I guess a possible solution would be to deploy shared scripts to a network filesystem (e.g. EFS) and just mount it on EC2 creation to gain both immutable infrastructure and access to updated scripts at the same time.

Hi @romasi!

I think you have all of the same building blocks I would’ve used here, and the only modification I would make is to use them in a different order: use templatefile to build the strings to pass to cloudinit_config, rather than using cloudinit_config to dynamically produce a template to render.

For example:

data "cloudinit_config" "example" {
  for_each = var.servers

  part {
    filename     = "common.sh"
    content_type = "text/x-shellscript"
    content = templatefile("${path.root}/scripts/common.sh", {
      hostname = "${each.value.environment}-server-${each.value.index}.${each.value.domain}"
    })
  }
  part {
    filename     = "configure.sh"
    content_type = "text/x-shellscript"
    content = templatefile("${path.module}/scripts/configure.sh", {
      hostname = "${each.value.environment}-server-${each.value.index}.${each.value.domain}"
    })
  }
}

resource "aws_instance" "example" {
  for_each = var.servers

  # ...

  user_data = data.cloudinit_config.example[each.key].rendered
}

By using the same for_each for both data "cloudinit_config" "example" and resource "aws_instance" "example" it’s possible to refer from one to the other using each.key to correlate the matching instances.
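
If the list of scripts grows, the repeated part blocks could probably be generated from a list instead. The following is only a sketch under a few assumptions: the local value script_templates and the data source name "scripts" are hypothetical names chosen for illustration, var.servers has the same shape as in the question, and gzip and base64_encode are turned off as in the merge_files module above so that the result can be passed to user_data as plain text.

```hcl
locals {
  # Hypothetical list of script templates, rendered in this order.
  script_templates = [
    "${path.root}/scripts/common.sh",
    "${path.module}/scripts/configure.sh",
  ]
}

data "cloudinit_config" "scripts" {
  for_each = var.servers

  # Assumption: plain-text output, matching the settings used earlier in the thread.
  gzip          = false
  base64_encode = false

  # One part per template; part.key is the list index, part.value is the path.
  dynamic "part" {
    for_each = local.script_templates

    content {
      filename     = "${part.key}_${basename(part.value)}"
      content_type = "text/x-shellscript"
      # "each" still refers to the current element of var.servers here.
      content = templatefile(part.value, {
        hostname = "${each.value.environment}-server-${each.value.index}.${each.value.domain}"
      })
    }
  }
}
```

The instance would then look up its own rendered document with data.cloudinit_config.scripts[each.key].rendered, in the same way as the example above.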

@apparentlymart, ah, I see. Since we already know the creation information in var.servers, we can just pre-compile that information ahead of time and look up the corresponding details on creation. Thanks again. Much appreciated.
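
For completeness, the "pre-compile and look up" idea can also be sketched without cloud-init multipart at all, by rendering each template per server and joining the results into one script. This is a minimal sketch, not something proposed in the thread: the local value user_data_by_server is a hypothetical name, and it assumes the scripts still behave correctly when concatenated into a single file (only the first shebang line takes effect, and an early exit in one script stops the ones after it).

```hcl
locals {
  # Hypothetical map: server key => all templates rendered and concatenated.
  user_data_by_server = {
    for key, server in var.servers : key => join("\n", [
      for script in [
        "${path.root}/scripts/common.sh",
        "${path.module}/scripts/configure.sh",
      ] : templatefile(script, {
        hostname = "${server.environment}-server-${server.index}.${server.domain}"
      })
    ])
  }
}

resource "aws_instance" "server" {
  for_each = var.servers

  ami           = each.value.ami_id
  instance_type = each.value.instance_type

  # Look up the script that was rendered for this particular server.
  user_data = local.user_data_by_server[each.key]
}
```

The cloudinit_config approach is likely the more robust of the two, since each script stays a separate part with its own content type.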