Hi, I wanted to run VMs using Nomad. I was on a self-exploration journey, trying to learn from YouTube and any blogs about Nomad. Initially it was a bit of a struggle: running Docker containers was a cakewalk because multiple working solutions were easy to find, but a fully working example for QEMU on Nomad was hard to find, and I spent more than a few weeks troubleshooting problems caused by missing configuration in my task file. The material from the GitHub account angrycub eventually helped me complete my experiments. Part of the reason I could not solve it easily was also that I had never used QEMU before. In the meantime, since QEMU was such a challenge for me, I tried to use Vagrant as a provisioner to run VMs instead of QEMU. That was difficult to get working as well, because of my limited knowledge of Nomad. Eventually I tried the raw_exec driver and started the Vagrant VM inside a bash script run by raw_exec; the task got restarted continuously because the vagrant command exits after starting the VM. I am still really interested in whether my experiment can be completed and whether a better version of what I have tried so far can be done. Below is the kind of Nomad task definition I tried:
job "virtualbox3" {
  datacenters = ["dc1"]
  type        = "service"

  group "virtualbox3-vm" {
    count = 1

    network {
      port "ssh" {
        static = 30023
        to     = 22
      }
    }

    reschedule {
      attempts  = 0
      unlimited = false
    }

    restart {
      attempts = 1
      mode     = "fail"
    }

    service {
      name = "virtualbox2"
    }

    task "create-vm3" {
      driver = "raw_exec"

      config {
        command = "bash"
        args    = ["./startup.sh", "${NOMAD_PORT_ssh}"]
      }

      template {
        data = <<EOF
#!/bin/bash
echo 'Vagrant.configure("2") do |config|' > Vagrantfile
echo '  config.vm.box = "ubuntu/focal64"' >> Vagrantfile
echo '  config.vm.hostname = "myfocal.box"' >> Vagrantfile
#echo '  config.vm.network :forwarded_port, guest: 22, host: ' $1 >> Vagrantfile
echo '  config.vm.network :forwarded_port, guest: 22, host: 30023' >> Vagrantfile
#echo '  config.ssh.username = "vagrant"' >> Vagrantfile
#echo '  config.ssh.password = "password"' >> Vagrantfile
#echo '  config.ssh.insert_key = false' >> Vagrantfile
#echo '  config.ssh.keys_only = false' >> Vagrantfile
echo 'end' >> Vagrantfile
vagrant up
while :
do
  sleep 10000
done
EOF
        destination = "startup.sh"
      }

      resources {
        cpu    = 5000
        memory = 5120
      }

      restart {
        attempts = 20
      }
    }

    task "vmcleanup" {
      lifecycle {
        hook    = "poststop"
        sidecar = false
      }

      driver = "raw_exec"

      config {
        command = "bash"
        args    = ["./cleanup.sh"]
      }

      template {
        data = <<EOF
#!/bin/bash
cd $NOMAD_ALLOC_DIR/…/create-vm3
echo $NOMAD_ALLOC_DIR
echo "destroying the VM"
vagrant destroy --force
EOF
        destination = "cleanup.sh"
      }
    }
  }
}
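One idea I have for the restart problem (a sketch only, not tested against a real VM): instead of the unconditional sleep loop after `vagrant up`, the script could block on the PID of the VM process itself, so the task lives exactly as long as the VM and Nomad sees a failure when the VM actually dies. The `VBoxHeadless` process name in the comment is my assumption about how VirtualBox names its headless process.

```shell
#!/bin/bash
# Sketch: keep the raw_exec task alive only as long as a given process.
# `kill -0` sends no signal; it just tests whether the PID still exists.
wait_for_pid() {
  local pid="$1"
  while kill -0 "$pid" 2>/dev/null; do
    sleep 1
  done
}

# In startup.sh this would replace the `while : ... done` loop, e.g.
# (hypothetical -- the pgrep pattern is an assumption):
#   vagrant up
#   vm_pid=$(pgrep -f VBoxHeadless | head -n1)
#   wait_for_pid "$vm_pid"
```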
A couple of problems I think the above approach has:
- Nomad cannot monitor the resource usage of the Vagrant VirtualBox VM, because Nomad only sees the bash script running the infinite while loop and therefore shows CPU and memory utilization as negligible. This could affect Nomad's job scheduling based on resource availability.
- Even if I create workarounds to put the VM on the host network using Vagrant's public network option, Nomad cannot discover and give me the IP address of the VM. Any other way of getting the IP address did not feel like the right way! (But I then saw there is a poststart hook in which I could run an SSH command to get the IP address of the machine.) This affects my requirement to reliably get an IP address for the VM that is reachable from the rest of my network, which I want to use in applications.
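The poststart hook mentioned above could look roughly like this (a hypothetical sketch: the task name, the relative path into the create-vm3 task directory, and `hostname -I` inside the guest are all my assumptions, not something I have verified):

```hcl
# Hypothetical poststart task: after create-vm3 is running, ask the guest
# for its address over vagrant ssh and write it to the shared alloc dir.
task "get-vm-ip" {
  lifecycle {
    hook    = "poststart"
    sidecar = false
  }

  driver = "raw_exec"

  config {
    command = "bash"
    # assumes the Vagrantfile lives in the sibling create-vm3 task directory
    args = ["-c", "cd $NOMAD_ALLOC_DIR/../create-vm3 && vagrant ssh -c 'hostname -I' > $NOMAD_ALLOC_DIR/vm-ip.txt"]
  }
}
```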
Advantages of the above task definition over QEMU:
- The above configuration can reuse machine images already downloaded by Vagrant on the same machine, and hence does not have to download an artifact over the network. The job configurations I found for QEMU always had to download the artifact over the network, and my artifacts are 5 to 10 GB, which I felt was a potential problem.
- Getting a QEMU job to run on the host network was also not easy, because the CNI plugin samples I could find were only for Docker jobs, and I am not even sure whether QEMU VMs can be run in bridged mode; I could not find a working sample for such an option anywhere. (But I now see there is a virt task driver plugin available in preview, and I guess it could make running a VM in bridged mode easier. If I can get a working procedure for that, it would be very helpful.)
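On the artifact-size point: if I read the qemu driver docs correctly, the client's qemu plugin block has an `image_paths` allowlist, which should let a job point `image_path` at an image that already exists on the host instead of downloading it through an artifact block. A sketch (the directory, image name, and the user-mode hostfwd networking mirroring the Vagrant port forward are my assumptions):

```hcl
# Client (agent) config -- allow the qemu driver to read images from a host path:
plugin "qemu" {
  config {
    image_paths = ["/var/vm-images"]
  }
}

# Job file -- reuse a local image, forwarding host port 30023 to guest port 22:
task "qemu-vm" {
  driver = "qemu"

  config {
    image_path  = "/var/vm-images/focal.qcow2"
    accelerator = "kvm"
    args = [
      "-netdev", "user,id=net0,hostfwd=tcp::30023-:22",
      "-device", "virtio-net,netdev=net0",
    ]
  }

  resources {
    cpu    = 5000
    memory = 5120
  }
}
```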