How to control/deploy nomad jobs and variables?

teymour · March 30, 2023, 12:35pm

Hey all,

I am starting to work on a nomad cluster for our services (trying to keep it minimal by not using vault or consul in the first iteration).

Currently I am researching the best way to control/deploy nomad jobs and variables.

My first approach was:

a jobs folder containing all job files
a variables folder containing all variables files
use ansible to upload these files to one of the nodes
then use ansible to iterate through these and execute nomad job run ... / nomad var put -force @... on that node

The ansible code for that is pretty minimal and it works fine. It also gives me the added benefit of using ansible-vault to encrypt the variables file in case these contain sensitive data.

But I just tried the terraform nomad provider to see how it compares. Note that I am aware that this provider does not yet support nomad variables (I saw an MR was created for it).

I used the following simple approach (we have ACLs enabled):

variable "nomad_token" {
  sensitive = true
}

provider "nomad" {
  address = "http://<some_host>:4646"

  secret_id = var.nomad_token
}

variable "jobs_folder" {
  type = string
}

resource "nomad_job" "job" {
  for_each = fileset(var.jobs_folder, "**/*.nomad")

  jobspec = file("${var.jobs_folder}/${each.value}")

  purge_on_destroy = true

  hcl2 {
    enabled  = true
    allow_fs = true

    vars = {
      nomad_token = var.nomad_token
    }
  }
}

I like the fact that using the terraform provider gives me a state, which makes removing/purging jobs (by renaming/deleting job files) much easier. On top of that, it lets me manage other resources like volumes, ACLs, etc.

It is also much faster (as it doesn’t have the ssh/ansible connection overhead).
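As a rough illustration of managing ACLs from the same Terraform configuration, the sketch below defines a read-only policy and a client token bound to it. The policy name, the token name, the "default" namespace and the "read-job" capability are assumptions made for illustration, not something taken from our setup.

```hcl
# Hypothetical policy: read-only access in the "default" namespace.
# The capability list is an assumption; adjust it to what the consumer needs.
resource "nomad_acl_policy" "service_discovery" {
  name        = "service-discovery"
  description = "Read-only access for service discovery"

  rules_hcl = <<-EOT
    namespace "default" {
      capabilities = ["read-job"]
    }
  EOT
}

# Client token restricted to the policy above.
resource "nomad_acl_token" "service_discovery" {
  name     = "service-discovery"
  type     = "client"
  policies = [nomad_acl_policy.service_discovery.name]
}
```

The token's secret_id is exported as a sensitive attribute, so it ends up in the Terraform state and the state itself needs to be protected.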

So here is my first question:

Some of our jobs will need the nomad ACL token (var.nomad_token). For example the traefik job needs it for service auto-discovery via nomad API.

With the code above, it seems I need to declare nomad_token as an input variable in every job. Is there any way to avoid that, assuming I would like to keep it generic and treat every job file the same way?
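To make the constraint concrete, here is a minimal sketch of what a job file would have to contain under this approach. The file name, job name, group name and datacenter are hypothetical, and the job is trimmed to the parts that touch the token.

```hcl
# Hypothetical jobs/traefik.nomad, reduced to the token-related parts.
# Each job that receives nomad_token through hcl2.vars declares it like this.
variable "nomad_token" {
  type        = string
  description = "Nomad ACL token passed in by the terraform provider"
}

job "traefik" {
  datacenters = ["dc1"] # assumed datacenter name

  group "traefik" {
    task "traefik_http" {
      driver = "docker"

      template {
        destination = "${NOMAD_TASK_DIR}/traefik.yml"
        # var.nomad_token is resolved when the job file is parsed,
        # so the token becomes part of the submitted job definition.
        data = <<-EOT
          providers:
            nomad:
              endpoint:
                token: "${var.nomad_token}"
        EOT
      }

      config {
        image = "traefik:3.0"
        args  = ["--configFile=${NOMAD_TASK_DIR}/traefik.yml"]
      }
    }
  }
}
```

One consequence of this pattern is that the token value is embedded in the job definition itself, so anyone allowed to read the job can likely read the token as well.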

My second question would be:

How are others managing the deployment of jobs in a production environment? I haven’t found many resources on that (or I missed them).

Thanks in advance!

hector.medina.cabane · April 12, 2023, 3:46pm

First question.

I don’t think Traefik needs any token for auto-discovery. Traefik reads the docker tags from docker-daemon. That’s it.

You have to provide a nomad token to apply a job itself by the CLI or HTTP API. You don’t need to declare that in the job file.

Second question.

We are using GitHub Actions for that, with a CI/CD pipeline that uses the Nomad HTTP API to deploy new versions of the jobs.

I hope this helped somehow, it’s a long post (haha). Let me know if you need anything else.

teymour · May 2, 2023, 12:37pm

Thanks for the answer!

Though I think passing the nomad token to the traefik nomad provider is necessary if you use ACLs (I just tested it by removing the token from the traefik config). See here. And I’d prefer to keep ACLs enabled, even if the UI is sitting behind a bastion.

For now I decided to store the nomad token as a nomad variable. I am still unsure about that, as it sounds weird and might pose a security risk. But we will start with only one ACL role/token (small team).
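For reference, a variable specification that matches the nomadVar lookup in the traefik config (path nomad/jobs, key nomad_token) could look roughly like the sketch below. The file name and the placeholder value are hypothetical.

```hcl
# Hypothetical file: variables/nomad_jobs.nv.hcl
# Written with: nomad var put -force @variables/nomad_jobs.nv.hcl
path = "nomad/jobs"

items {
  # Placeholder only; the real secret ID should not be committed in plain text.
  nomad_token = "<acl-token-secret-id>"
}
```

As far as I understand, a variable stored at nomad/jobs is readable by the tasks of every job in that namespace by default, which is worth weighing against the security concern mentioned above.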

traefik task:

    task "traefik_http" {
      driver = "docker"

      resources {
        cpu    = 100 # MHz
        memory = 64 # MB
      }

      volume_mount {
        volume      = "traefik-cert-storage"
        destination = "/etc/traefik/acme"
        read_only   = false
      }

      template {
        destination = "${NOMAD_TASK_DIR}/traefik.yml"
        change_mode = "restart"
        data        = file("./configs/traefik_http.yml.tpl")
      }

      config {
        image = "traefik:3.0"
        ports = ["traefik_admin", "https", "http"]
        args  = ["--configFile=${NOMAD_TASK_DIR}/traefik.yml"]
      }
    }

(part of the) traefik config:

providers:
  nomad:
    constraints: "Tag(`traefik=http`)"
    endpoint:
      address: http://{{ env "attr.unique.network.ip-address" }}:4646
      {{ with nomadVar "nomad/jobs" -}}
      token: {{ .nomad_token }}
      {{- end }}

sample service using traefik discovery:

    service {
      name     = "test-service-1"
      port     = "httpd"
      provider = "nomad"

      tags = [
        "traefik=http",
        "traefik.enable=true",
        "traefik.http.routers.service-1.rule=Host(`service-1.some.domain`)",
        "traefik.http.routers.service-1.tls=true",
        "traefik.http.routers.service-1.tls.certresolver=letsencrypt",
        "traefik.http.routers.service-1.entrypoints=web,websecure"
      ]
    }

teymour · May 2, 2023, 12:44pm

Also, regarding how to manage nomad jobs/variables, I opted to go fully the terraform way, as I really like having a state that lets me track what needs to be created/destroyed.

Since variables are not supported by the terraform nomad provider yet, I went with the following workaround (it still needs a solution to encrypt the nomad variable files):

resource "null_resource" "nomad_var" {
  for_each = fileset(var.variables_folder, "**/*.nv.hcl")

  triggers = {
    variable_file = filesha1("${var.variables_folder}/${each.value}")
  }

  connection {
    type  = "ssh"
    user  = "root"
    host  = var.nomad_random_node_ip

    bastion_host = var.bastion_ip
    bastion_user = "bastion"
  }

  provisioner "remote-exec" {
    inline = [
      "mkdir -p ${local.remote_var_folder}"
    ]
  }

  # copies over the variable file
  provisioner "file" {
    source      = "${var.variables_folder}/${each.value}"
    destination = "${local.remote_var_folder}/${each.value}"
  }

  provisioner "remote-exec" {
    inline = [
      "NOMAD_TOKEN=${var.nomad_token} nomad var put -force @\"${local.remote_var_folder}/${each.value}\""
    ]
  }
}
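Newer releases of the terraform nomad provider (2.0 and up, as far as I can tell) appear to include a nomad_variable resource, which would make the SSH workaround unnecessary. The sketch below is an assumption to verify against the provider docs for the version in use; the resource label is hypothetical, and the path and key simply mirror the ones read by the traefik template.

```hcl
# Assumes hashicorp/nomad provider 2.0 or later; verify before relying on it.
terraform {
  required_providers {
    nomad = {
      source  = "hashicorp/nomad"
      version = ">= 2.0.0"
    }
  }
}

# Same input as used for the provider block; repeated here only so the
# sketch stands alone. Do not declare it twice in one module.
variable "nomad_token" {
  type      = string
  sensitive = true
}

# Stores the token at the path the traefik template looks up.
resource "nomad_variable" "jobs" {
  path = "nomad/jobs"

  items = {
    nomad_token = var.nomad_token
  }
}
```

With this, the variable is tracked in the Terraform state like the jobs are, so the state needs the same protection as any other secret store.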
