Nomad job run pytechco-redis.nomad.hcl, deployment is never in Successful status

Hello,

I am just getting started with Nomad and am following this tutorial.

When issuing the following command

nomad job run pytechco-redis.nomad.hcl

The deployment status is always in progress, even though I waited for more than 10 minutes.

  ⠧ Deployment "0577dcd1" in progress...
    2023-04-14T15:26:10-04:00
    ID          = 0577dcd1
    Job ID      = pytechco-redis
    Job Version = 0
    Status      = running
    Description = Deployment is running

    Deployed
    Task Group  Desired  Placed  Healthy  Unhealthy  Progress Deadline
    ptc-redis   1        0       0        0          N/A

I have the setup configured locally on my MacBook. I have two terminal windows open, one for

sudo nomad agent -dev -bind 0.0.0.0

and the other for the other commands such as

export NOMAD_ADDR=http://localhost:4646
nomad job run pytechco-redis.nomad.hcl

Any idea where I should be looking?

Thank you,
Laurentius

Hey Laurentius,

Sorry you’re hitting this error. This Getting Started guide is pretty new, so we might not have ironed out all the kinks yet.

The easiest way to debug something like this is probably the Nomad UI. I would open the Job page in the UI and see if there are any errors that might be helpful.

Since you have a desired count of 1 and a placed count of 0, Nomad can’t find a node to place the task on. If you see an error that says something like “Placement Failures”, it should give you more info. Some common reasons are that you don’t have the required task driver, that you don’t have enough free resources (CPU, memory, or disk) on your computer, or that you’ve specified a constraint, such as requiring that the job run on Linux.
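
The same placement information is also available from the CLI. Here is a rough sketch, assuming the job name pytechco-redis and the dev agent address used in this thread. The helper name placement_reasons is made up for illustration; it only filters the bullet lines that Nomad prints under a failed task group.

```shell
# Assumes the dev agent from this thread; the CLI default is http://127.0.0.1:4646.
export NOMAD_ADDR="${NOMAD_ADDR:-http://localhost:4646}"

# Hypothetical helper: reads Nomad CLI output on stdin and prints only the
# placement-failure reasons, i.e. the lines that start with "* ".
placement_reasons() {
  grep -E '^[[:space:]]*\* ' | sed -E 's/^[[:space:]]*\* //'
}

# Only call the CLI if it is installed. `nomad job status <job>` includes a
# "Placement Failure" section when allocations could not be placed.
if command -v nomad >/dev/null 2>&1; then
  nomad job status pytechco-redis | placement_reasons
fi
```

If this prints a line mentioning "missing drivers", the node was filtered out because it lacks the task driver the job needs.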

My guess is that you don’t have Docker running on your Mac. If you see an error like “Constraint missing drivers filtered 1 node”, then that’s what happened. If you start Docker Desktop, it should fix itself. If that’s the case, we definitely need to specify that in the guide!

If not, let me know if you see any other errors, and I can help debug.

Hey hey, thanks for trying out the tutorial! Apologies that it’s not working for you at the moment but we’ll help figure it out!

I think @mnomitch’s advice is right on track as I was able to replicate your issue if I quit Docker Desktop and run the tutorial.

In addition to checking out the Job information in the UI, you can also click on the Clients page from the left navigation, click on the one client (your mac), and scroll down to the Driver Status section. If the docker driver isn’t showing as detected (see screenshot), that will confirm it - start Docker Desktop and you should be good to go!
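
If you prefer the terminal, something like the following should show the same thing. This is a sketch, not an official procedure: docker_reachable is a made-up helper name, and the exact layout of the driver table in the verbose node output may differ between Nomad versions.

```shell
# Hypothetical helper: succeeds only if the docker CLI is installed and can
# reach a running daemon (`docker info` fails when the daemon is down).
docker_reachable() {
  command -v docker >/dev/null 2>&1 && docker info >/dev/null 2>&1
}

if docker_reachable; then
  echo "docker daemon reachable"
else
  echo "docker daemon NOT reachable"
fi

# Show details for the local client node, including its task drivers.
# Skipped when the nomad CLI is not installed.
if command -v nomad >/dev/null 2>&1; then
  nomad node status -self -verbose \
    || echo "could not query the agent at ${NOMAD_ADDR:-http://127.0.0.1:4646}"
fi
```

Look for the docker row in the drivers section of the output and check whether it is reported as detected and healthy.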

Additionally, we’ll update the tutorial to mention that Docker is a prerequisite.

Let us know if that helps!

Hello @tonino and @mnomitch ,

I saw that error message.

Does it have to be Docker Desktop (license :heavy_dollar_sign:)? I have Rancher Desktop up and running, but it doesn’t seem to be “picked up” by nomad.

Thank you,
Laurentius

This is the screenshot from the UI with Rancher Desktop running.

I’m not familiar with Rancher Desktop, but after some googling, it looks like it lets you choose the container runtime, containerd or dockerd, and I’m assuming it uses containerd by default.

Can you check your config and switch to dockerd? Then stop your Nomad dev agent, restart Rancher Desktop, start the agent again, and see whether the Clients page shows the docker driver.

I’m not sure exactly how Nomad works internally, but if the docker command isn’t available when the agent starts, I imagine the driver won’t be detected. I think switching to the dockerd engine in Rancher Desktop might help.
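
One more thing that may be worth checking: as far as I know, Nomad’s docker driver talks to the Docker daemon’s API socket, and on Unix it defaults to unix:///var/run/docker.sock unless DOCKER_HOST or the plugin’s endpoint option says otherwise. The sketch below just checks whether a socket exists at a couple of candidate paths. The ~/.rd/docker.sock path is my assumption about where Rancher Desktop puts its socket, and socket_status is a made-up helper name.

```shell
# Hypothetical helper: reports whether a Unix socket exists at the given path.
# `-S` follows symlinks, so a symlink pointing at a live socket counts as present.
socket_status() {
  if [ -S "$1" ]; then
    echo "present: $1"
  else
    echo "missing: $1"
  fi
}

# /var/run/docker.sock is the docker driver's default endpoint on Unix.
socket_status /var/run/docker.sock

# ASSUMPTION: per-user socket location used by Rancher Desktop.
socket_status "$HOME/.rd/docker.sock"
```

If only the second path is present, the agent may simply not be looking there. Also note that the agent in this thread is started with sudo, so a DOCKER_HOST exported in your own shell may not be passed through to it. In that case, setting the endpoint option in the docker plugin block of the agent config could be a way to point Nomad at the right socket.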

Thanks @tonino. I’ll give it a try and let you know.

Neither containerd nor dockerd worked. I will dig into this further :grinning:

Once again, thanks for all your help.

Output:

▶ nomad job run pytechco-redis.nomad.hcl
==> 2023-04-14T17:33:53-04:00: Monitoring evaluation "e6419bce"
    2023-04-14T17:33:53-04:00: Evaluation triggered by job "pytechco-redis"
    2023-04-14T17:33:54-04:00: Evaluation within deployment: "f0c50e0c"
    2023-04-14T17:33:54-04:00: Evaluation status changed: "pending" -> "complete"
==> 2023-04-14T17:33:54-04:00: Evaluation "e6419bce" finished with status "complete" but failed to place all allocations:
    2023-04-14T17:33:54-04:00: Task Group "ptc-redis" (failed to place 1 allocation):
      * Constraint "missing drivers": 1 nodes excluded by filter
    2023-04-14T17:33:54-04:00: Evaluation "15fd9887" waiting for additional capacity to place remainder
==> 2023-04-14T17:33:54-04:00: Monitoring deployment "f0c50e0c"
  ⠴ Deployment "f0c50e0c" in progress...

    2023-04-14T17:33:54-04:00
    ID          = f0c50e0c
    Job ID      = pytechco-redis
    Job Version = 0
    Status      = running
    Description = Deployment is running

    Deployed
    Task Group  Desired  Placed  Healthy  Unhealthy  Progress Deadline
    ptc-redis   1        0       0        0          N/A