What is the best way to configure Nomad to pull from ECR private registries?

Hi, I'm struggling here to get Nomad to pull images from an ECR private registry. Is there a recommended way to do so?

Where is the target worker node? AWS, or on-premise (i.e. outside of AWS)?

The worker nodes are in AWS.

The way I would suggest is to do the following (some things can be tweaked):

  • attach an IAM role/policy to the EC2 instance to give it ECR read permissions (an example policy is shown below the config snippet).
  • run the aws ecr get-login-password ... command from a cron job every 11 hours (the token has a 12-hour lifetime).
  • configure the Nomad agent with the docker-credential-helper bits so it picks up the credentials seeded by the cron job:
plugin "docker" {
  config {
    auth {
      config = "/root/.docker/config.json"
      # Nomad will prepend "docker-credential-" to the helper value and call
      # that script name.
      helper = "ecr-login"
    }
  }
}
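
For the IAM role/policy bullet, a read-only policy along these lines should be enough (attaching the AWS-managed AmazonEC2ContainerRegistryReadOnly policy also works):

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ecr:GetAuthorizationToken",
        "ecr:BatchCheckLayerAvailability",
        "ecr:GetDownloadUrlForLayer",
        "ecr:BatchGetImage"
      ],
      "Resource": "*"
    }
  ]
}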

NOTE: The above is what I have “cobbled” together, but I am sure there is a more “secure” way of doing this, i.e. one where the ECR login happens automatically only during the image pull; I was not able to get that configured.

The ECR login command could be something like …

aws ecr get-login-password --region <whatever_region> | docker login --username AWS --password-stdin <url_of_ecr_repo>
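
If you want plain cron instead of the Nomad job below, the crontab entry (written here as an /etc/cron.d file; adjust the path and placeholders to your setup) could be as simple as:

# /etc/cron.d/ecr-login -- refresh the ECR token every 11 hours (the token lives for 12)
# make sure PATH covers the aws and docker binaries, or use absolute paths
0 */11 * * * root aws ecr get-login-password --region <whatever_region> | docker login --username AWS --password-stdin <url_of_ecr_repo>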

I have shoe-horned the following system job to make it behave like a system + cron … (it's a hack, but it works) :grinning_face_with_smiling_eyes:


# vim: tabstop=4 expandtab shiftwidth=4 softtabstop=4

job "aws-ecr-login" {

  type = "system"

  datacenters = ["dc1"]

  constraint {
    attribute = "${attr.kernel.name}"
    value     = "linux"
  }

  group "mygroup" {

    # restart block needed for 'system' job to ensure it stays running
    restart {
      mode     = "delay"
      interval = "30m"
      attempts = 20
      delay    = "1m"
    }

    task "mytask" {
      driver = "raw_exec"

      template {
        data = <<__END_OF_DATA__
#!/bin/bash

set -u

exec 2>&1

echo "#####"
hostname
date
which aws
echo "#####"

delay=$(( 11 * 3600 ))

while (( 1 )); do
    date
    aws ecr get-login-password --region <aws_region_here> | docker login --username AWS --password-stdin <aws_repo_here>

    echo "sleeping [${delay}] seconds ..."
    sleep ${delay}
done

exit 0

__END_OF_DATA__

        destination = "local/runme.bash"
      }

      config {
        command = "/bin/bash"
        args    = ["local/runme.bash"]
      }

      resources {
        cpu    = 100
        memory = 100
      }

      env {
        AWS_DEFAULT_REGION = "<aws_region_here>"
      }

      service {
        name = "aws-ecr-login"
        tags = ["aws-ecr-login"]
      }

    } # task
  }   # group
}     # job
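
Save the spec (e.g. as aws-ecr-login.nomad) and run it like any other job; every Linux client node then keeps a fresh ECR login:

nomad job run aws-ecr-login.nomad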

HTH. :slight_smile:

Thank you shantanugadgil!

I think the best way would be to:

  • Install amazon-ecr-credential-helper on the host (install options sketched below the config).

  • Attach an IAM role to the instance that allows pulling from your repository (if on AWS), or supply API credentials via environment variables.

  • Configure the Nomad daemon as explained in the docs:

    plugin "docker" {
       auth {
         config = "/etc/docker-auth.json"
       }
     }
    

    /etc/docker-auth.json content:

    {
      "credHelpers": {
        "<acct>.dkr.ecr.<region>.amazonaws.com": "ecr-login"
      }
    }
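
For the install step, the helper is packaged on recent Ubuntu/Debian, or it can be built with Go (which drops a docker-credential-ecr-login binary into your Go bin dir); roughly:

    # Ubuntu/Debian package
    sudo apt-get install amazon-ecr-credential-helper

    # or build from source with Go
    go install github.com/awslabs/amazon-ecr-credential-helper/ecr-login/cli/docker-credential-ecr-login@latest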
    

This way you don't need to configure auth-related stuff in every job.
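
With that in place a job just references the private image directly; for example (the repository name "myapp" is made up):

    task "myapp" {
      driver = "docker"

      config {
        # pulled straight from the private registry; no auth block needed in the job
        image = "<acct>.dkr.ecr.<region>.amazonaws.com/myapp:latest"
      }
    }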

Note: /etc/docker-auth.json is arbitrary and can be anywhere as long as the daemon has access to it.
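
To sanity-check the helper on a host, you can drive it over the Docker credential helper protocol directly: it reads the registry URL on stdin and prints a JSON credential (username "AWS" plus a temporary token):

    echo "<acct>.dkr.ecr.<region>.amazonaws.com" | docker-credential-ecr-login get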

Considering the scenario is on AWS, the IAM role approach would be the _only_ recommended one, rather than a key pair in env vars.

For on-prem, I have a cred “refresher” job as mentioned above, which fetches static secrets from Vault and keeps the login “alive”. :slight_smile:
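
In case it helps, that refresher is basically the system job above with the AWS keys templated in from Vault; a rough sketch of the extra task stanzas (the Vault policy name, secret path, and field names here are just examples, adjust to your layout):

vault {
  policies = ["aws-ecr-read"]
}

template {
  data = <<__EOT__
{{ with secret "secret/data/aws-ecr" }}
AWS_ACCESS_KEY_ID={{ .Data.data.access_key }}
AWS_SECRET_ACCESS_KEY={{ .Data.data.secret_key }}
{{ end }}
__EOT__
  destination = "secrets/aws.env"
  env         = true
}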


Thank you both for those tips!