This isn't strictly a packer issue, but I'll ask anyway in case someone here has come across it.
I'm using vSphere with OVA Ubuntu templates to provision virtual machines (Ubuntu 22.04 LTS (Jammy Jellyfish) daily [20230615]). With these templates the virtual machine reboots after the cloud-init changes have been applied, so there is an intermediary stage where the VM boots up with (I'm guessing) all the normal services, the ssh server among them.
This is, of course, picked up by packer, which connects to the VM over SSH and starts running ansible before the VM reboots, which leads to:
Failed to connect to the host via ssh: ssh: connect to host 10.0.0.1 port 22: Connection refused
I'm not sure what has changed in this latest version of the Ubuntu cloud image (2023-06-13); maybe the ssh server is now starting in this intermediary stage when it shouldn't. I'll have to check that.
I was wondering if you had any ideas about how I could go about solving this issue.
I'm using packer version 1.7.9 and ansible version 2.14.6.
The discussion around the bug has gone stale in the meantime. At this point I have no idea how one is supposed to run packer with the Ubuntu OVA template anymore.
Maybe you're supposed to add a timeout, but any value you pick depends entirely on the vagaries of the hypervisor or whatever other weird circumstances you happen to be running under.
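To make that concrete, these are the kinds of knobs I mean on the packer side. A rough sketch only; the source type and the values are placeholders for my setup:

```hcl
source "vsphere-clone" "ubuntu" {
  # ... vCenter / template settings omitted ...

  communicator            = "ssh"
  ssh_username            = "ubuntu"
  ssh_timeout             = "20m" # keep retrying while the template does its cloud-init reboot
  ssh_handshake_attempts  = 100   # tolerate "connection refused" during the intermediary boot
  pause_before_connecting = "2m"  # guesswork: wait after the first successful connection before provisioning
}
```

None of these values are deterministic, though; you're just hoping the reboot falls inside whatever window you picked.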
I've tried using bootcmd: systemctl stop sshd and runcmd: systemctl start sshd (cloud-init directives), but this adds a huge delay to the deployment, because ssh starts very late; cloud-init itself seems to be held up precisely because ssh isn't starting.
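The trick is to guard the stop command so that it only runs on the very first boot. Something along these lines should do it (a sketch; I'm using cloud-init-per for the run-once part, and the unit name may be ssh or sshd depending on the image):

```yaml
#cloud-config
bootcmd:
  # cloud-init-per drops a marker file under /var/lib/cloud, so the guarded
  # command is skipped on every boot after the first one
  - [ cloud-init-per, once, stop-sshd, systemctl, stop, sshd ]
```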
This ensures that the command runs only once; the virtual machine then reboots and ssh starts normally afterwards.
Before that I had just tried stopping and starting ssh with bootcmd and runcmd (systemctl stop/start sshd), but that didn't work: bootcmd runs at every boot, and because some service depended on ssh itself, or was just waiting for it to start (at least that's my take on it), runcmd would only run really late, after 5 minutes or so. So that wasn't very practical.
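For reference, that naive variant looked roughly like this (sketch):

```yaml
#cloud-config
bootcmd:
  - systemctl stop sshd    # bootcmd runs on every boot, not just the first one
runcmd:
  - systemctl start sshd   # runcmd runs once per instance, but only near the very end of cloud-init
```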
In case someone else comes across this, maybe it will help. I do wonder a little why this isn't a bigger issue, given that Canonical decided to change the template like that, but I guess for most people this layer is covered by the public clouds and such.