How to scale a job automatically to fill capacity?

I would like to always have a service job running on my cluster, similar to Folding@Home, that scales up to use all available resources. It should be preempted when I run higher-priority jobs, then scale back up to fill the remaining capacity when those jobs finish.

How can I do this with Nomad?

Hi @Bedouin :wave:

Is this a single job that would increase or decrease its resource values, or several jobs where you would modify the count number?

Hi!

I want to modify the count number automatically while keeping the resource values constant.

Thanks! For this you will need to run the Nomad Autoscaler. You will also need something like Prometheus running to collect your cluster metrics.

Take a look at this guide to get a quick start on how to use the Autoscaler.

Once you are more familiar with how it works, you will need to add a scaling policy to your job. You will also want to reduce the job’s priority so other jobs are able to preempt it.

Creating your policy query might be a bit challenging. Here’s the list of metrics that Nomad emits: Metrics | Nomad by HashiCorp

I think that, in your case, you would want something like

sum(nomad_client_unallocated_memory) / <AMOUNT OF MEMORY RESERVED FOR YOUR TASK>
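To make the arithmetic behind that query concrete, here is a minimal Python sketch of what it computes: the unallocated memory summed over all clients, divided by the memory one task reserves. The function name, the node names, and the sample readings are all made up for illustration, and rounding down to a whole number is my assumption rather than something the Autoscaler is documented to do here.

```python
import math


def instances_from_unallocated(unallocated_mb_by_node, task_memory_mb):
    """Mirror sum(nomad_client_unallocated_memory) / task_memory_mb.

    unallocated_mb_by_node: hypothetical mapping of node name to
    unallocated memory in MB, standing in for the per-client metric.
    The result is rounded down (an assumption for this sketch).
    """
    if task_memory_mb <= 0:
        raise ValueError("task_memory_mb must be positive")
    total_unallocated = sum(unallocated_mb_by_node.values())
    return math.floor(total_unallocated / task_memory_mb)


# Hypothetical readings, in MB, for three client nodes.
sample = {"node-a": 2048, "node-b": 1024, "node-c": 300}

# 3372 MB unallocated in total, 512 MB per task.
print(instances_from_unallocated(sample, 512))  # prints 6
```

Note that the sum is cluster-wide, so it counts leftover memory on a node (such as the 300 MB on node-c above) even when no single 512 MB task could be placed there.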

All things considered, your job would look something like this:

job "folding-at-home" {
  # ...
  priority = 10  # Defaults to 50.
  # ...
  group "folding-at-home" {
    # ...
    scaling {
      min = 0
      max = 10  # Adjust as necessary.
      policy {
        check "resources_available" {
          source = "prometheus"
          query  = "sum(nomad_client_unallocated_memory)/512"

          strategy "pass-through" {}
        }
      }
    }

    task "folding-at-home" {
      # ...
      resources {
        memory = 512
      }
      # ...
    }
  }
}

I know that’s a lot to take in, but let me know if anything was not clear :sweat_smile:

Looks like I’ll use a combination of unallocated_cpu, unallocated_disk, and unallocated_memory. Thank you!


Yes, exactly. Sorry, I forgot to mention the other resources :upside_down_face:
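As a rough illustration of combining the three resources, the count is limited by whichever one runs out first, so one option is to compute the figure per resource and take the smallest. This is only a sketch of the arithmetic in Python, not an Autoscaler query: the function name, the resource keys, and every number below are hypothetical.

```python
def instances_across_resources(unallocated, per_task):
    """Return how many tasks fit given several unallocated resources.

    unallocated: hypothetical cluster-wide totals, e.g. CPU in MHz and
                 memory/disk in MB (stand-ins for the summed metrics).
    per_task:    what a single task reserves, using the same keys.
    The scarcest resource decides the result.
    """
    counts = []
    for name, needed in per_task.items():
        if needed <= 0:
            raise ValueError(f"per-task {name} must be positive")
        # Whole tasks only, so use integer division.
        counts.append(unallocated[name] // needed)
    return min(counts)


# Hypothetical totals: CPU is the bottleneck in this example.
unallocated = {"cpu": 2000, "memory": 3372, "disk": 50000}
per_task = {"cpu": 500, "memory": 512, "disk": 300}

# cpu -> 4, memory -> 6, disk -> 166, so the answer is 4.
print(instances_across_resources(unallocated, per_task))  # prints 4
```

Expressing the same "smallest of three" logic in a single Prometheus query takes more care, so it may be easier to start with the resource you expect to be the tightest.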