Packer 1.6 + Ansible 2.9 failing on gather_facts with ansible provisioner

Hi, I’m struggling to get consistent builds working with the ansible provisioner. Specifically, builds fail intermittently at the gather_facts step: the AnsiballZ_setup.py module doesn’t get transferred to the instance properly.

Example ami.json:

{
  "variables": {
    "access_key": "{{env `AWS_ACCESS_KEY_ID`}}",
    "secret_key": "{{env `AWS_SECRET_ACCESS_KEY`}}",
    "home": "{{env `HOME`}}"
  },
  "builders": [
    {
      "type": "amazon-ebs",

      "access_key": "{{user `access_key`}}",
      "secret_key": "{{user `secret_key`}}",

      "associate_public_ip_address": true,
      "ssh_username": "{{user `ssh_username`}}",
      "ssh_keypair_name": "{{user `ssh_keypair_name`}}",
      "ssh_private_key_file": "{{user `home`}}/.ssh/{{user `ssh_keypair_name`}}.pem",
      "ssh_timeout": "{{user `ssh_timeout`}}",

      "instance_type": "{{user `instance_type`}}",

      "ami_name": "{{user `environment`}}_{{user `name`}}_{{isotime \"2006_01_02_03_04\"}}",
      "region": "{{user `region`}}",
      "vpc_id": "{{user `vpc_id`}}",
      "subnet_id": "{{user `subnet_id`}}",
      "source_ami_filter": {
        "filters": {
          "name": "*{{user `source_ami_name`}}*",
          "tag:environment": "{{user `environment`}}"
        },
        "owners": ["{{user `aws_account_id`}}"],
        "most_recent": true
      },

      "ami_block_device_mappings": [
        {
          "device_name": "/dev/sda1",
          "volume_size": 16,
          "volume_type": "gp2",
          "delete_on_termination": true
        }
      ],
      "tags": {
        "Name": "{{user `environment`}}_{{user `name`}}",
        "environment": "{{user `environment`}}"
      }
    }
  ],
  "provisioners": [
    {
      "type": "ansible",
      "extra_arguments": [ "--extra-vars", "env={{user `environment`}}",  "--ssh-extra-args", "-o IdentitiesOnly=yes" ],
      "ansible_env_vars": [
        "ANSIBLE_CONFIG={{user `ansible_config`}}",
        "ansible_scp_if_ssh=False"
      ],
      "playbook_file": "{{user `ansible_playbook_path`}}/{{user `name`}}.yml"
    }
  ],
  "post-processors": [
    {
      "type": "manifest",
      "output": "{{user `manifest_path`}}"
    }
  ]
}

Packer log:

2020/11/29 11:01:14 packer-builder-amazon-ebs plugin: [ERROR] ssh session open error: 'read tcp 172.16.1.203:50629->35.175.108.43:22: read: connection reset by peer', attempting reconnect
2020/11/29 11:01:14 packer-builder-amazon-ebs plugin: [DEBUG] reconnecting to TCP connection for SSH
2020/11/29 11:01:14 packer-builder-amazon-ebs plugin: [DEBUG] handshaking with SSH
2020/11/29 11:01:14 packer-builder-amazon-ebs plugin: [DEBUG] handshake complete!
2020/11/29 11:01:14 packer-builder-amazon-ebs plugin: [DEBUG] Opening new ssh session
2020/11/29 11:01:15 packer-builder-amazon-ebs plugin: [INFO] agent forwarding enabled
2020/11/29 11:01:15 packer-builder-amazon-ebs plugin: [DEBUG] starting remote command: /bin/sh -c 'rm -f -r '"'"'~davidzausner/.ansible/tmp/ansible-tmp-1606665673.391375-39745-182640655033499/'"'"' > /dev/null 2>&1 && sleep 0'
2020/11/29 11:01:15 packer-builder-amazon-ebs plugin: [INFO] RPC endpoint: Communicator ended with: 0
2020/11/29 11:01:15 [INFO] 0 bytes written for 'stderr'
2020/11/29 11:01:15 [INFO] 0 bytes written for 'stdout'
2020/11/29 11:01:15 [INFO] RPC client: Communicator ended with: 0
2020/11/29 11:01:15 [INFO] RPC endpoint: Communicator ended with: 0
2020/11/29 11:01:15 [INFO] 0 bytes written for 'stdin'
2020/11/29 11:01:15 packer-provisioner-ansible plugin: [INFO] 0 bytes written for 'stdout'
2020/11/29 11:01:15 packer-provisioner-ansible plugin: [INFO] 0 bytes written for 'stderr'
2020/11/29 11:01:15 packer-provisioner-ansible plugin: [INFO] RPC client: Communicator ended with: 0
2020/11/29 11:01:15 packer-provisioner-ansible plugin: [INFO] 0 bytes written for 'stdin'
2020/11/29 11:01:15 1606665675,,ui,message,    amazon-ebs: fatal: [default]: FAILED! => {"msg": "failed to transfer file to /Users/davidzausner/.ansible/tmp/ansible-local-394918qzohx7e/tmpai5sk9l3 ~davidzausner/.ansible/tmp/ansible-tmp-1606665673.391375-39745-182640655033499/AnsiballZ_setup.py:\n\n\n"}
2020/11/29 11:01:15 1606665675,,ui,message,    amazon-ebs:
2020/11/29 11:01:15 1606665675,,ui,message,    amazon-ebs: PLAY RECAP *********************************************************************
2020/11/29 11:01:15 1606665675,,ui,message,    amazon-ebs: default                    : ok=0    changed=0    unreachable=0    failed=1    skipped=0    rescued=0    ignored=0
2020/11/29 11:01:15 1606665675,,ui,message,    amazon-ebs: localhost                  : ok=5    changed=1    unreachable=0    failed=0    skipped=1    rescued=0    ignored=0
2020/11/29 11:01:15 1606665675,,ui,message,    amazon-ebs:
2020/11/29 11:01:15 packer-provisioner-ansible plugin: shutting down the SSH proxy
2020/11/29 11:01:15 [INFO] (telemetry) ending ansible
2020/11/29 11:01:15 1606665675,,ui,say,==> amazon-ebs: Provisioning step had errors: Running the cleanup provisioner%!(PACKER_COMMA) if present...

Things I’ve tried

  • Setting longer SSH persistence (-o ControlMaster=auto -o ControlPersist=30m) via --ssh-extra-args in the Packer JSON file
  • Setting parallel: False on gather_facts
  • Using an explicit setup task instead of gather_facts
  • Adding an Ansible task to delete tmp files between builds
  • Setting ansible_python_interpreter in the Packer JSON files, and at the ansible.cfg level
  • Setting ansible_scp_if_ssh=False in the Packer JSON file, and setting scp_if_ssh = False in the ansible.cfg file
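For reference, the ControlMaster/ControlPersist attempt looked roughly like this in the provisioner block (values approximate):

```json
"extra_arguments": [
  "--extra-vars", "env={{user `environment`}}",
  "--ssh-extra-args", "-o IdentitiesOnly=yes -o ControlMaster=auto -o ControlPersist=30m"
]
```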

What is most odd is that this error will happen many times in a row, then not at all for a few runs, and then again. When building a single AMI, the error comes up only sparingly; when building multiple AMIs (up to 5), it comes up much more frequently. Any help with this would be really appreciated, thanks!


What is your reasoning behind setting ansible_scp_if_ssh=False?

Ansible is failing to transfer the setup module, which gathers and returns the facts. Setting scp_if_ssh to False means Ansible will not use scp, but sftp. From the Ansible documentation:

When set to smart, Ansible will try them until one succeeds or they all fail. If set to True, it will force ‘scp’, if False it will use ‘sftp’.
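If you do intend to force sftp, it may be worth double-checking how that setting reaches Ansible. Packer’s ansible_env_vars entries are exported as environment variables, and the environment variable Ansible 2.9 actually reads for this option is the uppercase ANSIBLE_SCP_IF_SSH, so a lowercase ansible_scp_if_ssh=False entry is likely being ignored:

```json
"ansible_env_vars": [
  "ANSIBLE_CONFIG={{user `ansible_config`}}",
  "ANSIBLE_SCP_IF_SSH=False"
]
```

or equivalently in ansible.cfg:

```ini
[ssh_connection]
scp_if_ssh = False
```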

The problem seems clear:

{ 
  "msg": "failed to transfer file to /Users/davidzausner/.ansible/tmp/ansible-local-394918qzohx7e/tmpai5sk9l3 ~davidzausner/.ansible/tmp/ansible-tmp-1606665673.391375-39745-182640655033499/AnsiballZ_setup.py:\n\n\n"
}

Failure to transfer the module payload is usually due to one of:

  • the underlying transport protocol (sftp/scp) failing (are the correct ports open on the security group that was created?) – this seems unlikely given how quickly it failed
  • the connecting user being denied permission to write the transferred file

I suspect the remote image. Perhaps the user is not present?
Could you share details about the AMI you’re starting from and the playbook you’re using?
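If you can shell into the build instance while it’s up, a couple of quick checks along those lines might help. This is just a sketch: the username and sshd_config path are assumptions, so adjust them to your AMI.

```shell
# Hypothetical sanity checks to run on the remote build instance.
check_remote() {
  user="$1"
  sshd_config="$2"

  # 1. Does the user Packer connects as actually exist on the image?
  if id "$user" >/dev/null 2>&1; then
    echo "user $user exists"
  else
    echo "user $user MISSING"
  fi

  # 2. Is an sftp subsystem configured? With scp_if_ssh=False Ansible
  #    uses sftp, which fails outright if sshd offers no sftp subsystem.
  if grep -iq '^subsystem[[:space:]]*sftp' "$sshd_config" 2>/dev/null; then
    echo "sftp subsystem configured"
  else
    echo "sftp subsystem NOT configured"
  fi
}

check_remote davidzausner /etc/ssh/sshd_config
```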