Nomad won't run task as another user

Dear Nomad community,

I have a Nomad + Consul cluster and I am trying to run the following raw_exec job:

variable "datacenter" {
  type    = string
}

job "munge" {
  priority = 95 # higher values mean higher priority (max 100)
  datacenters = ["${var.datacenter}"]
  type = "system"
  group "munge" {

    restart {
      attempts = 50
      delay    = "15s"
      interval = "30m"
      mode     = "fail"
    }

    task "setup" {
      lifecycle {
        hook    = "prestart"
        sidecar = false
      }
      driver = "raw_exec"
      user   = "root"
      config {
        command = "bash"
        args    = ["-c", "zypper install --no-confirm --no-recommends munge munge-libs munge-devel && mkdir -p /var/run/munge && chown -R munge:munge /var/run/munge && chmod -R 0755 /var/run/munge"]
      }
    }

    task "munge" {
      driver = "raw_exec"
      user = "munge"
      config {
        command = "/usr/sbin/munged"
        args = ["--foreground"]
      }
    }

    task "remove" {
      lifecycle {
        hook    = "poststop"
        sidecar = false
      }
      driver = "raw_exec"
      user   = "root"
      config {
        command = "bash"
        args    = ["-c", "zypper remove --no-confirm munge munge-libs munge-devel ; rm -rf /var/run/munge ; "]
      }
    }
  }
}

My problem is with the task "munge", which only runs on the Nomad/Consul servers. The Nomad client agents fail to run this task with the following error:

failed to launch command with executor: rpc error: code = Unknown desc = failed to start command path="/usr/sbin/munged" --- args=["/usr/sbin/munged" "--foreground"]: fork/exec /usr/sbin/munged: permission denied

Surprisingly, though, I can run the same command when I ssh into the node:

sudo su munge /usr/sbin/munged --foreground

Any idea of what could be wrong?

thank you

@masuberu what are the actual permissions on that executable? And what groups does the service user belong to?

here:

$ ls -alsh /usr/sbin/munged
116K -rwxr-xr-x 3 munge munge 113K Apr 28  2021 /usr/sbin/munged

Hi @masuberu, sorry for the slow reply. I may have found the root cause: you likely followed the Nomad production hardening guide and set the Nomad client’s data dir to have permissions 0700 and be owned by root. In this situation the raw_exec driver is not able to run a task as any user besides root (though that was not the intent).

Not sure what the fix is yet.

Whelp, just to follow up: if you’re following the production hardening guide, then this is working as intended. The raw_exec driver is incompatible with how the Nomad client’s data directory is permissioned. I opened pull request #17897 on hashicorp/nomad ("docs: clarify using user on raw_exec driver") to clarify as much in the raw_exec driver documentation.
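
For anyone who wants to confirm they are in the same situation, here is a rough sketch of a check for the data directory permissions described above. The helper name others_can_traverse is made up for illustration, and it assumes GNU stat (the -c option) as found on most Linux distributions. It only inspects the "other" permission bits, so it does not account for group membership or ACLs.

```shell
# Hypothetical helper (not part of Nomad): report whether users other than
# the owner and group can traverse a directory, based on its mode bits.
others_can_traverse() {
  dir=$1
  # GNU stat prints the octal mode, e.g. 700 or 755.
  mode=$(stat -c '%a' "$dir") || return 2
  # The last octal digit holds the "other" permission bits.
  other=$((mode % 10))
  # Bit 1 is the execute (search) bit, which is what traversal needs.
  if [ $((other & 1)) -eq 1 ]; then
    echo "traversable"
  else
    echo "not traversable"
  fi
}
```

Run it as root against the data_dir value from your client configuration (the path depends on your install). If it prints "not traversable" and the directory is owned by root, a task user such as munge cannot reach its task directory underneath it, which would be consistent with the permission denied error in this thread.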