How to share qcow artifact between allocs?

Hello,
I am looking for some help sharing a .qcow2 image, downloaded as an artifact, between multiple allocs.
My use case:

  • Build the image with Packer and upload it to Artifactory
  • Download the image to a shared directory on some Nomad clients
  • Start multiple jobs, each running a single QEMU image as a snapshot

What worked so far:

  • Add a plugin stanza to the client configuration:
plugin "qemu" {
  config {
    image_paths = ["/srv/nomad/images"]
  }
}
  • Download image manually to this folder
  • Use the image
...
image_path = "/srv/nomad/images/packer-focal2004"
...

But if I download the image via the artifact stanza from Artifactory, it is stored in the data_dir/alloc/alloc-id/srv/nomad/images folder:

      artifact {
        source = "https://artifactory/path/to/image"
        destination = "/srv/nomad/images/packer-focal2004"
      }

What options do I have to download and share between allocs?

Hello, any suggestions here?

Hi @schlumpfit :wave:

I don’t have a lot of experience with QEMU, so I may be misunderstanding what you are trying to do.

In general, Nomad allocations have their own isolated file system, since different allocations are not guaranteed to run on the same host.

The artifact block is also restricted to placing files only in the allocation file system, to avoid security issues where a job could place arbitrary (and potentially malicious) files somewhere in the host file system.

If you want to share data between allocations on the same host, you can use host volumes. If the allocations are on different hosts you will need to set up some mechanism to share these files, like NFS for example. If you are running in a cloud environment, you may be able to use CSI as well.

Does this help?

Hi @lgfa29,

I think you understood everything correctly.

I am using a shared drive which is mounted on each node to /srv/nomad/images.
The issue so far was that each client created its own allocation file system. This resulted in the image being downloaded to /srv/nomad/images/clientX/allocs/alloc/image instead of /srv/nomad/images (what I expected), since I added

plugin "qemu" {
  config {
    image_paths = ["/srv/nomad/images"]
  }
}

(But this was a false and silly assumption on my part.)

I will follow your advice and dig more into host volumes.

The ideal case for me would be:

  • Use a shared drive for qemu base images.
  • Download the base image in case it is not present.
  • Start qemu from that base, but store the snapshot/overlay image in the alloc file system.
  • Delete the snapshot once the job is done.
  • Keep the base image (Which leads to the fact that the shared folder needs to be cleaned up manually from time to time, which should not be too hard as in the ideal case the needed images would just be re-downloaded again)

I think that this is what you need then (again, I’m not very familiar with QEMU :sweat_smile:):

Create a folder on your clients to serve as the shared drive, like /srv/nomad/images as you’ve been using (make sure it exists on all clients).

Then add a host_volume block to your clients’ configuration files:

client {
  # ...
  host_volume "images" {
    path = "/srv/nomad/images"
  }
}

In your job, add the volume and volume_mount:

job "example" {
  # ...
  group "example" {
    # ...
    volume "images" {
      type   = "host"
      source = "images"  # This value must match the volume name in your client config.
    }
    # ...
    task "example" {
      driver = "qemu" 
  
      config {
        image_path = "/srv/nomad/images/..."
        # ...
      }

      volume_mount {
        volume      = "images"  # This value must match the name of your volume in this job.
        destination = "/srv/nomad/images"
      }
    }
  }
}

This is probably the trickiest bit. In theory you could use an artifact block to download the images, but due to the order in which artifacts are downloaded and volumes are mounted, the volume will actually be mounted over the downloaded artifact.

You could maybe write a custom script that runs as a prestart task in your job that checks for the image and downloads it if not available. Not a great solution though :confused:
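A minimal sketch of what such a check-and-download helper could look like (the function name is made up for illustration, and the URL/path arguments are placeholders):

```shell
#!/bin/sh
# ensure_base_image URL PATH
# Downloads the qcow2 base image to PATH only if it is not already there,
# so concurrent jobs reuse the shared copy instead of re-fetching it.
ensure_base_image() {
    url="$1"
    path="$2"
    if [ -f "$path" ]; then
        echo "base image already present: $path"
    else
        echo "downloading base image to $path"
        curl -fsSL "$url" -o "$path"
    fi
}
```

This could then run as a prestart exec task, e.g. `ensure_base_image https://artifactory/path/to/image /srv/nomad/images/packer-focal2004`.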

This is where my lack of QEMU knowledge trips me up. You would need to check where the snapshots/overlays are being stored.

If the snapshot is being stored in the allocation directory, then this happens automatically (not exactly when the job is done, but when the Nomad garbage collector runs).

Maybe this could be another custom script running as a system job? The tricky part would be making sure that no active allocation is using the image before removing it.
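As a rough illustration of such a cleanup script (the 14-day retention period is an arbitrary assumption, and this deliberately does not solve the in-use check mentioned above):

```shell
#!/bin/sh
# prune_old_images DIR DAYS
# Deletes qcow2 base images in DIR that have not been modified for more
# than DAYS days.
# WARNING: this does not check whether a running allocation still uses an
# image; a qcow2 overlay breaks if its backing file disappears.
prune_old_images() {
    dir="$1"
    days="$2"
    find "$dir" -maxdepth 1 -type f -name '*.qcow2' -mtime +"$days" -print -delete
}
```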


I hope this helps; if not, feel free to ask more questions :grinning_face_with_smiling_eyes:

Thanks @lgfa29 for your help so far.

After a while I had the time, and the support of a colleague, to come back to this issue.

Everything seems to be more complicated than expected:

  1. The qemu task driver has no volume mounting capabilities.
  2. As you already mentioned:

Because the task driver mounts the volume, it is not possible to have artifact, template, or dispatch_payload blocks write to a volume.

  3. We cannot make use of the exec driver in combination with the host volume, since we have an NFS share mounted (for reasons) and Nomad fails to chown (when running as root).
  4. Even if we could download the image to the shared directory, I am not sure it is even possible to link the overlay file (which would be located inside the alloc dir) to the base image. My understanding here is:

Shared Storage Layout:

shared_images/
|- base-image-to-be-downloaded.qcow2
clientX/
|- alloc/
   |- alloc-id/
      |- overlay-image.qcow2 -> "reads from" -> base-image-to-be-downloaded.qcow2

Client config:

data_dir = "/mnt/shared_storage/clientX"
client {
  options = {
    "driver.allowlist" = "qemu, exec"
  }
  host_volume "shared_images_volume" {
    path = "/mnt/shared_storage/shared_images/"
  }
}

Job config with exec:

job "download_artifact" {
  datacenters = ["dc1"]

  group "group1" {
    volume "images" {
      type      = "host"
      source    = "shared_images_volume"
      read_only = false
    }

    task "download" {
      driver = "exec"

      config {
        command = "/usr/bin/curl"
        args =  ["https://url-to-base-image.qcow2", "-o", "/images/base-image.qcow2"]
      }

      volume_mount {
        volume = "images"
        destination = "/images"
        read_only = false
      }
      lifecycle {
        hook    = "prestart"
        sidecar = false
      }
    }
  }
}

Resulting behavior with the exec task driver:

  • Task driver starts
  • shared_images_volume gets mounted into /mnt/shared_storage/clientX/alloc-dir/images
  • curl downloads the “base-image-to-be-downloaded.qcow2”
  • Even if the shared_images_volume is unmounted after the task finishes, the “base-image-to-be-downloaded.qcow2” remains accessible under /mnt/shared_storage/shared_images/base-image-to-be-downloaded.qcow2

Resulting behavior with the artifact stanza

  • “base-image-to-be-downloaded.qcow2” is downloaded into /mnt/shared_storage/clientX/alloc-dir/images/base-image-to-be-downloaded.qcow2 by the artifact stanza
  • Once the task driver starts, it mounts the shared_images_volume over /mnt/shared_storage/clientX/alloc-dir/images, shadowing the previously downloaded artifact

Remaining question

Is it possible, given the Nomad storage layout, for files in a “native” alloc directory to symlink/reference files on a mounted volume or in an image_paths directory, as in the case of the qemu plugin?
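For what it’s worth, a qcow2 overlay references its backing file via a path recorded in the overlay’s own metadata, not via a symlink, so an overlay created inside the alloc dir can point at an absolute path on the shared mount. A hypothetical prestart task along these lines (the task name and paths are assumptions, and qemu-img must be installed on the client):

```hcl
task "create-overlay" {
  driver = "exec"

  lifecycle {
    hook    = "prestart"
    sidecar = false
  }

  config {
    command = "/usr/bin/qemu-img"
    # "-b" records the backing file's absolute path inside the overlay,
    # so no symlink into the volume is needed.
    args = [
      "create", "-f", "qcow2", "-F", "qcow2",
      "-b", "/mnt/shared_storage/shared_images/base-image-to-be-downloaded.qcow2",
      "local/overlay-image.qcow2",
    ]
  }
}
```

Given the NFS chown issue you hit with the exec driver, this might need the raw_exec driver instead.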