How to ensure Terraform fails if Ansible fails

In Terraform, local-exec will march on even if a single Ansible playbook fails.

I’d like to ensure a local-exec fails immediately, with an appropriate exit code, as soon as that happens. I think this should be the default behaviour, but if you have any ideas I’d love to know.

Here is an example:

resource "null_resource" "provision_1" {
  provisioner "local-exec" {
    command = <<EOT
      set -x
      echo 'test'
      ansible-playbook ansible-fail.yml
      echo 'continuing'
EOT

  }
}

resource "null_resource" "provision_2" {
  provisioner "local-exec" {
    command = <<EOT
      set -x
      echo 'continuing as well'
EOT

  }
}

ansible-fail.yml

- hosts: localhost

  tasks:
  - fail:
      msg: enforce failure.

…and the output, showing that ‘continuing’ is printed when it shouldn’t be.

terraform apply --auto-approve
null_resource.provision_1: Creating...
null_resource.provision_2: Creating...
null_resource.provision_1: Provisioning with 'local-exec'...
null_resource.provision_2: Provisioning with 'local-exec'...
null_resource.provision_1 (local-exec): Executing: ["/bin/sh" "-c" "      set -x\n      echo 'test'\n      ansible-playbook ansible-fail.yml\n      echo 'continuing'\n"]
null_resource.provision_2 (local-exec): Executing: ["/bin/sh" "-c" "      set -x\n      echo 'continuing as well'\n"]
null_resource.provision_1 (local-exec): + echo test
null_resource.provision_1 (local-exec): test
null_resource.provision_2 (local-exec): + echo 'continuing as well'
null_resource.provision_1 (local-exec): + ansible-playbook ansible-fail.yml
null_resource.provision_2 (local-exec): continuing as well
null_resource.provision_2: Creation complete after 0s [id=76518244843484622]
null_resource.provision_1 (local-exec):  [WARNING]: No inventory was parsed, only implicit localhost is available
null_resource.provision_1 (local-exec):  [WARNING]: provided hosts list is empty, only localhost is available. Note
null_resource.provision_1 (local-exec): that the implicit localhost does not match 'all'

null_resource.provision_1 (local-exec): PLAY [localhost] ***************************************************************

null_resource.provision_1 (local-exec): TASK [Gathering Facts] *********************************************************
null_resource.provision_1 (local-exec): ok: [localhost]

null_resource.provision_1 (local-exec): TASK [fail] ********************************************************************
null_resource.provision_1 (local-exec): fatal: [localhost]: FAILED! => {"changed": false, "msg": "enforce failure."}

null_resource.provision_1 (local-exec): PLAY RECAP *********************************************************************
null_resource.provision_1 (local-exec): localhost                  : ok=1    changed=0    unreachable=0    failed=1    skipped=0    rescued=0    ignored=0

null_resource.provision_1 (local-exec): + echo continuing
null_resource.provision_1 (local-exec): continuing
null_resource.provision_1: Creation complete after 4s [id=5512796884102997803]

Apply complete! Resources: 2 added, 0 changed, 0 destroyed.

I posted this in the github thread https://github.com/hashicorp/terraform/issues/15469
This is my current solution, which just emerged: I’ve found that using a local-exec with simply

echo "Failed Message" >&2
exit 1

is working well for me.

…extended to Ansible in a local-exec, I’m now doing this:

resource "null_resource" "fail-test" {
  provisioner "local-exec" {
    command = <<EOT
      exit_test () {
        RED='\033[0;31m' # Red Text
        GREEN='\033[0;32m' # Green Text
        BLUE='\033[0;34m' # Blue Text
        NC='\033[0m' # No Color
        if [ $? -eq 0 ]; then
          printf "\n $GREEN Playbook Succeeded $NC \n"
        else
          printf "\n $RED Failed Playbook $NC \n" >&2
          exit 1
        fi
      }
      ansible-playbook ansible-fail.yml; exit_test
  
EOT

  }
}

I think it would be cool if Terraform had a mode to detect any non-zero exit code on each line, though. It would be useful.

Hi @queglay,

From Terraform’s perspective, the command value is just an opaque string to be passed to the shell, so it’s the shell’s responsibility to decide how to handle errors.

If you are using bash then you should be able to arrange for the behavior you want by adding the following to the start of your script:

set -e

(With that said, do be sure to consider the caveats about -e.)

As you’ve seen, Terraform will halt processing if the result is unsuccessful, but the definition of what is unsuccessful is up to the shell that is processing the commands, not up to Terraform itself. Depending on what you’re running, you might also consider using -o pipefail and other bash options to select behaviors appropriate for what your script is expecting.
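For example (a minimal sketch of the idea, not taken from the thread — the explicit bash interpreter and the playbook name are assumptions on my part), the original provisioner could let the shell abort on the first failing command:

resource "null_resource" "provision_1" {
  provisioner "local-exec" {
    # Ask for bash explicitly so that -o pipefail is available;
    # as the log above shows, the default on Unix-like systems is /bin/sh.
    interpreter = ["/bin/bash", "-c"]

    command = <<EOT
      set -euo pipefail   # stop at the first non-zero exit code
      echo 'test'
      ansible-playbook ansible-fail.yml
      echo 'continuing'   # never reached if the playbook fails
EOT

  }
}

With that change, a failing playbook makes the whole command exit non-zero, which Terraform then reports as a provisioner failure.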


I’d like to slightly necro this, as I’m struggling even after following the advice on this thread.

I’ve got the following provisioner:

provisioner "local-exec" {
    command    = <<EOT
     if [[ var.region = 'us-west-2' || var.region = 'us-east-2' ]]
     then
      echo Region Valid
     else
      exit 1
     fi
  EOT
  }

Seemingly no matter what I do in the conditional, the Terraform run continues on failure. I’ve validated the conditions, and can confirm that the script is definitely exiting 1, yet Terraform just keeps on keeping on. Has the default behavior for local-exec changed? I’ve also tried set -e and placed a command that will fail in my then block, but still no dice.

What about this?

provisioner "local-exec" {
    command    = <<EOT
     if [[ "${var.region}" == "us-west-2" || "${var.region}" = "us-east-2" ]]
     then
      echo 'Region Valid'
     else
      exit 1
     fi
  EOT
  }
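One caveat (my own note, not something raised in the thread): as the log earlier shows, local-exec runs the command through /bin/sh -c by default, and [[ ... ]] is a bash extension, so it may be worth requesting bash explicitly along with set -e, for example:

provisioner "local-exec" {
  # [[ ... ]] is a bashism, so run under bash rather than the default /bin/sh
  interpreter = ["/bin/bash", "-c"]

  command = <<EOT
    set -e
    if [[ "${var.region}" == "us-west-2" || "${var.region}" == "us-east-2" ]]
    then
      echo 'Region Valid'
    else
      echo 'Invalid region' >&2
      exit 1
    fi
EOT
}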

Hello guys, I’m here because we are having the same problem doing something like this:

provisioner "local-exec" {
  command = format("git clone %s /tmp/%s ; cd /tmp/%s ; git checkout %s ; ansible-playbook %s.yml --extra-vars '%s ip=%s' ; rm -rf /tmp/%s", var.ansible.repo, local.repo_tmp_dir, local.repo_tmp_dir, var.ansible.branch, var.ansible.playbook, var.ansible_extra_vars, local.private_ip, local.repo_tmp_dir)
}

In this case, if Ansible fails, Terraform will keep working and will also tell you that the infra was created OK. We fixed that behavior by doing this:

provisioner "local-exec" {
  command = format("git clone %s /tmp/%s && cd /tmp/%s && git checkout %s && ansible-playbook %s.yml --extra-vars '%s ip=%s' && rm -rf /tmp/%s", var.ansible.repo, local.repo_tmp_dir, local.repo_tmp_dir, var.ansible.branch, var.ansible.playbook, var.ansible_extra_vars, local.private_ip, local.repo_tmp_dir)
}

In this second example, not only will Terraform stop executing, it will also taint the resource that runs the "local-exec", so on the next run you don’t need to recreate the infra, just re-run Ansible against the infra you already created.

Hi @madipietro,

It looks like you already saw that in order for Terraform to consider a local-exec provisioner as “failed” the overall command you run needs to exit with a non-successful status, which you’ve achieved here by chaining the steps together with && so that the first failure will abort the chain.

In a situation where the creation of an object (which includes running provisioners) fails partway through, Terraform can’t tell how far the process got and so as you’ve seen it will plan to destroy the object and create a new one in order to start the process again. Terraform represents the need to do that using the “tainted” status.

If you want Terraform to consider running that command as a separate operation from creating the object the provisioner is embedded within, I think that will mean moving the provisioner out into a separate resource block which can then fail independently of the first one. The hashicorp/null provider has a special resource type null_resource which is intentionally designed to do nothing at all, so that you can associate provisioner blocks with it for actions that must happen independently of any “real” resources:

resource "example_virtual_machine" "example" {
  # (configuration of your VM, just as a
  # placeholder because you didn't share that
  # part of your configuration.)
}

resource "null_resource" "example" {
  triggers = {
    # This resource will be re-created, and thus
    # the provisioner re-run, each time the
    # VM's IP address changes. Adjust this to
    # whatever is a suitable attribute to represent
    # the VM being replaced.
    instance_ip_addr = example_virtual_machine.example.private_ip
  }

  provisioner "local-exec" {
    command = format("git clone %s /tmp/%s && cd /tmp/%s && git checkout %s && ansible-playbook %s.yml --extra-vars '%s ip=%s' && rm -rf /tmp/%s", var.ansible.repo, local.repo_tmp_dir, local.repo_tmp_dir, var.ansible.branch, var.ansible.playbook, var.ansible_extra_vars, local.private_ip, local.repo_tmp_dir)
  }
  }
}

With this structure, when the provisioner fails Terraform will record that null_resource.example failed, but it will already have considered example_virtual_machine.example to have succeeded. Therefore a subsequent terraform plan will only plan to “replace” null_resource.example (re-run the provisioner) and will leave example_virtual_machine.example unchanged.
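
As an aside (a usage note I’m adding, not part of the original reply): once the provisioner lives on its own null_resource, on recent Terraform versions you can also force it to re-run on demand without touching the VM by asking Terraform to replace just that resource:

# Re-run only the provisioner by replacing the null_resource it is attached to
terraform apply -replace=null_resource.example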