Use azure-chroot Builder for LVM-enabled Boot Device

We’ve been using Packer to build AWS Machine Images and Azure VM Templates for the best part of a decade: the AWS builders since 2016 and the Azure builders since 2017. The generic method we use across CSPs is to “recycle” the build-VM’s boot-disk as the target disk. This worked for years with the amazon-ebs builder for creating Amazon Machine Images and, similarly, with the azure-arm builder for creating Azure VM-templates. Both required some “pre” scripts to perform a pivot_root and to wholly free up and blank the boot-disk. We were able to use this method to create AMIs and VM-templates for RHEL 6, 7 and 8 (and derivatives).

Unfortunately, that all came to a screaming halt with RHEL 9 (and derivatives). Red Hat foisted onto the world a distro that seems to have a kernel bug: the partition that hosts the / filesystem cannot be freed up, so the boot-disk cannot be recycled. Interestingly, RHEL 10 (and derivatives) don’t seem to have that bug, so it may be less a Red Hat issue than a regression in the kernel Red Hat settled on for EL 9, one that went away with the kernel Red Hat moved to for RHEL 10.

Due to peculiarities in AWS and how RHUI entitlements are managed for pay-as-you-go EC2s, this forced us to move to the amazon-ebssurrogate builder for RHEL 9 builds specifically. However, we opted to refactor all of our RHEL 7/8/9 and derivative builds to use it, to minimize the config-sprawl in our automation-definitions.

On Azure, RHUI entitlements notionally do not have the same license-enforcement method, so we were trying to move to the azure-chroot builder. Unfortunately, it feels like I’ve run into a capability-gap in the azure-chroot builder. Specifically, it seems that the builder doesn’t support building VM-templates with LVM-enabled boot devices. Any time I try to run a build, it fails with errors like:

==> azure-chroot.minimal-rhel-9-image: Pausing after run of step 'StepPreMountCommands'. Press enter to continue.
==> azure-chroot.minimal-rhel-9-image: Mounting the root device...
==> azure-chroot.minimal-rhel-9-image: error mounting root volume: exit status 32
==> azure-chroot.minimal-rhel-9-image: Stderr: mount: /mnt/packer-azure-chroot-disks/sdc: wrong fs type, bad option, bad superblock on /dev/sdc1, missing codepage or helper program, or other error.
==> azure-chroot.minimal-rhel-9-image:
==> azure-chroot.minimal-rhel-9-image: error mounting root volume: exit status 32
==> azure-chroot.minimal-rhel-9-image: Stderr: mount: /mnt/packer-azure-chroot-disks/sdc: wrong fs type, bad option, bad superblock on /dev/sdc1, missing codepage or helper program, or other error.
==> azure-chroot.minimal-rhel-9-image:
==> azure-chroot.minimal-rhel-9-image: Step "StepMountDevice" failed
==> azure-chroot.minimal-rhel-9-image: [c] Clean up and exit, [a] abort without cleanup, or [r] retry step (build may fail even if retry succeeds)?

I’d hoped that setting mount_partition to the root-LVM’s /dev/mapper… path would let me work past this, and that I could then just use chroot_mounts to mount my LVM volumes. That didn’t work.

The next thing I tried was setting the mount_partition value to "", null or None. No success with any of these settings. Setting it to the /dev/mapper… path produced invalid-format type errors, as did the null and None attempts. Setting it to an empty string just resulted in the bare {{.Device}} value being used (and errors about it being an invalid device).
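
For reference, the attempts described above look roughly like the fragment below. This is only a sketch, not my exact template. The volume names and the xfs filesystem type in chroot_mounts are hypothetical placeholders, the /dev/mapper… path is elided as it is above, and the other settings the builder needs are left out.

```hcl
# Sketch only: other azure-chroot settings a real build needs
# (image_resource_id, source or from_scratch, and so on) are omitted.
source "azure-chroot" "minimal-rhel-9-image" {
  # Attempt 1: point mount_partition at the root logical volume.
  # The path stays elided, exactly as in the text above.
  mount_partition = "/dev/mapper/…"

  # Attempt 2 (alternative): an empty string, which left the bare
  # {{.Device}} value as the mount source.
  # mount_partition = ""

  # Additional LVM volumes to mount inside the chroot. Each entry is
  # [filesystem type, source device, mount point].
  # Volume names and the xfs type are hypothetical.
  chroot_mounts = [
    ["xfs", "/dev/mapper/RootVG-varVol", "/var"],
    ["xfs", "/dev/mapper/RootVG-logVol", "/var/log"],
  ]
}
```

As far as I can tell, setting chroot_mounts replaces the builder's default list rather than appending to it, so a real configuration would probably also need to list the usual proc, sysfs and /dev entries. Treat that as my assumption, not confirmed behavior.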

So, is there a method for using the azure-chroot builder the way I am trying, or is there currently a capability-gap?