Good day all.
I’m having some trouble finding the root cause of my problem and I hope somebody here can shed some light on the situation.
My config looks like:
{
  "builders": [
    {
      "type": "amazon-ebs",
      "region": "us-east-1",
      "ami_name": "{{user `build_id`}}",
      "ssh_username": "{{user `vm_username`}}",
      "ssh_timeout": "2h",
      "ssh_read_write_timeout": "2h",
      "ssh_interface": "public_ip",
      "communicator": "ssh",
      "ami_virtualization_type": "hvm",
      "tenancy": "host",
      "ebs_optimized": "true",
      "instance_type": "mac2.metal",
      "vpc_id": "{{ user `vpc_id` }}",
      "subnet_id": "{{ user `subnet_id` }}",
      "associate_public_ip_address": true
      -- SNIP block devices, AMI, tmp IAM --
    }
  ],
  "provisioners": [
    {
      "type": "shell",
      "inline": [
        "PDISK=$(diskutil list physical external | head -n1 | cut -d' ' -f1)",
        "APFSCONT=$(diskutil list physical external | grep Apple_APFS | tr -s ' ' | cut -d' ' -f8)",
        "yes | sudo diskutil repairDisk $PDISK",
        "sudo diskutil apfs resizeContainer $APFSCONT 0"
      ]
    },
    {
      "type": "shell",
      "execute_command": "chmod +x {{ .Path }}; sudo {{ .Vars }} {{ .Path }}",
      "script": "./reboot.sh",
      "skip_clean": true,
      "expect_disconnect": true
    },
    {
      "type": "shell",
      "start_retry_timeout": "1800s",
      "inline": "<more commands here>"
    }
  ]
}
Once the reboot script runs, the connection to the server is broken and Packer attempts to reconnect. After roughly 5 minutes the Packer run fails with:
2022/12/22 12:56:08 [INFO] 38 bytes written for 'uploadData'
2022/12/22 12:56:08 packer-plugin-amazon_v1.1.6_x5.0_linux_amd64 plugin: 2022/12/22 12:56:08 [DEBUG] Opening new ssh session
2022/12/22 12:56:08 packer-plugin-amazon_v1.1.6_x5.0_linux_amd64 plugin: 2022/12/22 12:56:08 [ERROR] ssh session open error: 'client not available', attempting reconnect
2022/12/22 12:56:08 packer-plugin-amazon_v1.1.6_x5.0_linux_amd64 plugin: 2022/12/22 12:56:08 [DEBUG] reconnecting to TCP connection for SSH
2022/12/22 12:56:23 packer-plugin-amazon_v1.1.6_x5.0_linux_amd64 plugin: 2022/12/22 12:56:23 [ERROR] reconnection error: dial tcp 3.88.221.153:22: i/o timeout
2022/12/22 12:56:23 packer-provisioner-shell plugin: Retryable error: Error uploading script: dial tcp 3.88.221.153:22: i/o timeout
2022/12/22 12:56:23 [INFO] (telemetry) ending shell
At that point provisioning stops and cleanup starts.
As the config shows, ssh_timeout, ssh_read_write_timeout and start_retry_timeout are all set to values much higher than 5 minutes, and I’ve tried pause_before as well, but the run still fails after roughly 5 minutes.
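For illustration, here is a minimal sketch of how pause_before could sit alongside start_retry_timeout on the provisioner that follows the reboot. The 20m value and its placement on this particular provisioner are assumptions made for the example, not values taken from the failing run:

```json
{
  "type": "shell",
  "pause_before": "20m",
  "start_retry_timeout": "1800s",
  "inline": ["<more commands here>"]
}
```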
I cannot figure out where to extend this wait time, as a reboot of a mac2.metal machine takes 15-30 minutes on AWS.
Any help would be most welcome.
Kind regards
Henti