Jobs failing to pull images from ghcr.io

After upgrading to Nomad 1.9.0, I’m seeing jobs failing to pull images from ghcr.io.

    task "task" {
      driver = "docker"

      config {
        image = "ghcr.io/.../...:latest"
        args  = ["npm", "run", "task"]

        auth {
          username       = "...."
          password       = "...."
          server_address = "ghcr.io"
        }
      }
     ...

I’ve confirmed that the credentials are good – I can successfully pull the image using the PAT on one of my Nomad clients (tried as both the root and ubuntu users).

Here’s what I see in my nomad logs:

Oct 21 19:57:08 ip-10-2-1-165 nomad[158446]:     2024-10-21T19:57:08.100Z [INFO]  client.alloc_runner.task_runner: Task event: alloc_id=6ef1effb-5ec2-44e0-a371-b3a64d01badc task=task type=Received msg="Task received by client" failed=false
Oct 21 19:57:08 ip-10-2-1-165 nomad[158446]:     2024-10-21T19:57:08.108Z [INFO]  client.alloc_runner.task_runner: Task event: alloc_id=6ef1effb-5ec2-44e0-a371-b3a64d01badc task=task type="Task Setup" msg="Building Task Directory" failed=false
Oct 21 19:57:08 ip-10-2-1-165 nomad[158446]:     2024-10-21T19:57:08.172Z [INFO]  agent: (runner) creating new runner (dry: false, once: false)
Oct 21 19:57:08 ip-10-2-1-165 nomad[158446]:     2024-10-21T19:57:08.172Z [INFO]  agent: (runner) creating watcher
Oct 21 19:57:08 ip-10-2-1-165 nomad[158446]:     2024-10-21T19:57:08.172Z [INFO]  agent: (runner) starting
Oct 21 19:57:08 ip-10-2-1-165 nomad[158446]:     2024-10-21T19:57:08.192Z [INFO]  agent: (runner) rendered "(dynamic)" => "/opt/nomad/data/alloc/6ef1effb-5ec2-44e0-a371-b3a64d01badc/task/local/env"
Oct 21 19:57:08 ip-10-2-1-165 nomad[158446]:     2024-10-21T19:57:08.204Z [INFO]  client.alloc_runner.task_runner: Task event: alloc_id=6ef1effb-5ec2-44e0-a371-b3a64d01badc task=task type=Driver msg="Downloading image" failed=false
Oct 21 19:57:08 ip-10-2-1-165 nomad[158446]:     2024-10-21T19:57:08.252Z [ERROR] client.driver_mgr.docker: failed pulling container: driver=docker image_ref=ghcr.io/.../...:latest error="Error response from daemon: Head \"https://ghcr.io/v2/.../.../manifests/latest\": unauthorized"
Oct 21 19:57:08 ip-10-2-1-165 nomad[158446]:     2024-10-21T19:57:08.252Z [INFO]  client.alloc_runner.task_runner: Task event: alloc_id=6ef1effb-5ec2-44e0-a371-b3a64d01badc task=task type="Driver Failure" msg="Failed to pull `ghcr.io/.../...:latest`: Error response from daemon: Head \"https://ghcr.io/v2/.../.../manifests/latest\": unauthorized" failed=false
Oct 21 19:57:08 ip-10-2-1-165 nomad[158446]:     2024-10-21T19:57:08.253Z [ERROR] client.alloc_runner.task_runner: running driver failed: alloc_id=6ef1effb-5ec2-44e0-a371-b3a64d01badc task=task error="Failed to pull `ghcr.io/.../...:latest`: Error response from daemon: Head \"https://ghcr.io/v2/.../.../manifests/latest\": unauthorized"
Oct 21 19:57:08 ip-10-2-1-165 nomad[158446]:     2024-10-21T19:57:08.253Z [INFO]  client.alloc_runner.task_runner: restarting task: alloc_id=6ef1effb-5ec2-44e0-a371-b3a64d01badc task=task reason="Restart within policy" delay=17.414633532s

The docker driver is working fine otherwise – public images are being pulled and run fine.

Any help would be much appreciated.

I just noticed this fix: Pull Request #24215 in hashicorp/nomad by pkazmierczak, "docker: fix a bug where auth for private registries wasn't parsed correctly". I'm now upgrading to v1.9.1.
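
For reference, the task-level `auth` block in the job above is not the only way to hand registry credentials to the docker driver: the client's docker plugin configuration can also point at a Docker auth file. Below is a minimal sketch of that client-side setting. The file path is an assumption (it is where `docker login ghcr.io` run as root would normally write credentials) and should be adjusted to the actual host. Nothing in this thread establishes whether this path was also affected by the 1.9.0 parsing bug, so it is a point of comparison rather than a confirmed workaround; upgrading to a release that contains the fix remains the resolution.

```hcl
# Nomad client agent configuration (not the job file).
plugin "docker" {
  config {
    auth {
      # Assumed path: a Docker config file containing ghcr.io credentials,
      # e.g. as written by `docker login ghcr.io` when run as root.
      config = "/root/.docker/config.json"
    }
  }
}
```

With this in place the job's `config` block would omit its own `auth` stanza, and the client would need to be restarted to pick up the plugin configuration.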
