I have a module that launches our AWS hosts and configures them via cloud-init (CI).
We've been experiencing intermittent CI failures, so we initially added a remote-exec provisioner directly to the instance to wait for CI to finish; the provisioner's exit status then produced a pass/fail in Terraform (TF).
Unfortunately that caused a deadlock: our CI needed the volumes attached before it could complete, but TF couldn't attach the volumes while the provisioner was still waiting on CI.
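For reference, the original approach looked roughly like this (a minimal sketch; the connection details are assumed to match the ones used in the null_resource further down):

resource "aws_instance" "instance" {
  count         = var.instance_count
  ami           = var.ami_id
  instance_type = var.instance_type
  user_data     = var.user_data

  # TF does not consider the instance created until this provisioner succeeds,
  # but cloud-init was itself blocked waiting for volumes that TF could only
  # attach after creation, which is where the deadlock came from.
  provisioner "remote-exec" {
    inline = [
      "sudo cloud-init status --wait > /dev/null"
    ]

    connection {
      user         = "bot"
      host         = self.private_ip
      timeout      = var.ssh_timeout
      private_key  = var.bot_key_pem
      bastion_host = var.bastion_host
    }
  }
}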
Our solution was to move the provisioner onto a null_resource instead, and this works for all of our use cases, with one side effect.
If the null_resource fails, it doesn't taint the instance. So on the next TF run the null_resource is re-created and fails again, because the [broken] host still exists. This causes problems in our pipeline.
Is there a way I can taint the instance if the null_resource fails? Here's the relevant part of the module:
resource "aws_instance" "instance" {
count = var.instance_count
ami = var.ami_id
instance_type = var.instance_type
user_data = var.user_data
iam_instance_profile = var.iam_instance_profile
// omitted many attributes to save space
}
resource "aws_volume_attachment" "volume_attachment" {
count = var.volume_ids == null ? 0 : length(var.volume_ids)
skip_destroy = true
instance_id = element(aws_instance.instance.*.id, count.index)
volume_id = element(var.volume_ids, count.index)
device_name = var.device_name
}
resource "null_resource" "cloud_init_status" {
count = var.bot_key_pem != null ? var.instance_count : 0
triggers = {
instance_id = element(aws_instance.instance.*.id, count.index)
}
provisioner "remote-exec" {
inline = [
"echo \"Running cloud-init status --wait > /dev/null\"",
"sudo cloud-init status --wait > /dev/null",
"sudo cloud-init status --long"
]
connection {
user = "bot"
host = element(aws_instance.instance.*.private_ip, count.index)
timeout = var.ssh_timeout
private_key = var.bot_key_pem
bastion_host = var.bastion_host
}
}
}