Getting to know the Nomad Autoscaler

I’ve recently started trying out the Nomad Autoscaler to see if it would be useful for my organisation to use in production. I have the autoscaler Docker plugin running on my admiral machine, and now that I have added a scaling stanza to its group stanza, I am able to manually scale a job up to more instances through the UI. However, I am unsure how I can automatically scale the application (assuming my server has capacity to run more instances of it).
I have configuration for the Autoscaler that can scale the cluster, and while I have seen that the readme says:

In Nomad, horizontal application autoscaling can be achieved by modifying the number of allocations in a task group based on the value of a relevant metric, such as CPU and memory utilization or number of open connections.

I don’t see how I can do this. Is scaling the application as opposed to the cluster documented anywhere?
I have looked at the files for the two demos, but it’s not clear to me from them how scaling the application is separated from scaling the cluster. Can someone clarify where this information lives or how this works, please?

Hi @Rumbles :wave:

The Nomad Autoscaler is capable of scaling applications by adding a scaling block in your group. You can find the documentation here: https://www.nomadproject.io/docs/job-specification/scaling

We also have a Vagrant-based demo for application scaling here: https://github.com/hashicorp/nomad-autoscaler/tree/master/demo/vagrant

In that demo there’s an example job with a scaling block that you can use as a reference.
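For a quick reference, a minimal group-level scaling block with a policy might look roughly like this (the check name, query, and numbers are placeholders you would adapt to your setup):

group "cache" {
  scaling {
    min = 1
    max = 10

    policy {
      check "cpu" {
        source = "prometheus"
        query  = "..."

        strategy "target-value" {
          target = 70
        }
      }
    }
  }
}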

I hope this helps!


Thanks for the quick response, but that didn’t quite answer my question. If you have a scaling block, how do you differentiate between application and cluster scaling?

Short answer:
As a rule of thumb, policies that are placed in the job file are for application scaling, while policies that are read from a file are for cluster scaling, but this may not always be true.

Longer answer:
The difference is defined by the target of the policy.

For example, a cluster scaling policy would look something like this (full example):

min = 1
max = 2

policy {
  cooldown            = "2m"
  evaluation_interval = "1m"

  target "aws-asg" {
    dry-run             = "false"
    aws_asg_name        = "hashistack-nomad_client"
    node_class          = "hashistack"
    node_drain_deadline = "5m"
  }

  check "cpu_allocated_percentage" {
    source = "prometheus"
    query  = "..."
    strategy "target-value" {
      target = 70
    }
  }
}

You can see that this policy is targeting an AWS Autoscaling Group, since its target is aws-asg, so it will perform cluster scaling. Cluster scaling policies are usually stored in files and loaded by the Autoscaler from the directory set in its policy dir configuration (or via the -policy-dir flag).
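If you use the agent configuration file rather than the -policy-dir flag, the relevant part is the policy block (the path here is just a placeholder):

policy {
  dir = "/etc/nomad-autoscaler/policies"
}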

Application scaling policies use the Nomad Task Group target, but this is done automatically for you when you place the scaling block in your job file.

The example I linked in my previous message is equivalent to this:

enabled = false
min     = 1
max     = 20

policy {
  cooldown = "20s"

  target "nomad" {
    Job   = "example"
    Group = "cache"
  }

  check "avg_instance_sessions" {
    source   = "prometheus"
    query    = "..."

    strategy "target-value" {
      target = 5
    }
  }
}

Nomad will just create that target section for you based on your job file.
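For comparison, inside the job file that same policy would be written roughly like this (the job and group names are taken from the target block above, and the rest of the group is elided):

job "example" {
  group "cache" {
    ...

    scaling {
      enabled = false
      min     = 1
      max     = 20

      policy {
        cooldown = "20s"

        check "avg_instance_sessions" {
          source = "prometheus"
          query  = "..."

          strategy "target-value" {
            target = 5
          }
        }
      }
    }
  }
}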

Thanks for that clarification. That helps me understand this a bit better!

Just some further questions :slight_smile: Can a scaling policy that scales the cluster also scale up the application? Or are they mutually exclusive?

If there is capacity for another copy of the application on the running server, will the policy which includes config to scale an aws-asg add another copy of the application on the running machine, or will it spin up another machine?

If you have added the target aws-asg block, does it overwrite the target nomad block you mention exists by default, or can the config have multiple targets?

Can a scaling policy that scales the cluster also scale up the application? Or are they mutually exclusive?

Each policy can only have one target, so it will either scale the cluster or the app. We haven’t really thought about this because the query usually returns metrics for a specific target (like memory available in the cluster, or request latency of an app), but I am curious to learn about the use case you have in mind :slight_smile:

If there is capacity for another copy of the application on the running server, will the policy which includes config to scale an aws-asg add another copy of the application on the running machine, or will it spin up another machine?

If I understand this correctly, there are actually two things going on here, and that’s why each policy can only have one target.

Let’s say you have an application policy like this (simplified for brevity):

policy {
  check "avg_sessions" {
    source = "prometheus"
    query  = "open_connections / nomad_nomad_job_summary_running"

    strategy "target-value" {
      target = 10
    }
  }
}

This policy will make sure that you have an average of 10 open connections per application instance. Now let’s imagine that your app is trending on Twitter (:tada:), and your metric jumps to 200 connections per instance (:scream:). The Autoscaler will update your job to scale the group to 20x its current number of instances.
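To make the math concrete, the target-value strategy scales the group count proportionally to how far the metric is from the target, so roughly (subject to rounding and the min/max bounds):

factor    = metric / target          # 200 / 10 = 20
new_count = current_count * factor   # e.g. 1 instance -> 20 instances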

But your cluster can’t handle this many instances, so some allocations will be stuck pending, waiting for new resources.

Now let’s imagine that you also have a cluster policy like this (also simplified for brevity):

policy {
  check "mem_allocated_percentage" {
    source = "prometheus"
    query  = "100 * nomad_client_allocated_memory/(nomad_client_unallocated_memory + nomad_client_allocated_memory)"

    strategy "target-value" {
      target = 70
    }
  }

  target "aws-asg" {
    aws_asg_name = "hashistack-nomad_client"
  }
}

This policy will look at the percentage of used memory in the cluster and scale up (add nodes) when more than 70% of memory is used.

With 20x more app instances trying to run you will certainly run out of memory. The Autoscaler will detect this because of the cluster policy above and add new nodes to meet the demand.

Once these new nodes are up, the pending allocations will be scheduled into them just like Nomad would normally do.

As you can see, two scaling events happened: both the app and the cluster scaled up. But they did so for different reasons and independently from each other: the app had too many connections to handle and the cluster ran out of memory.

So if you want multiple things to happen you will most likely need multiple policies.

The only side note here is for system jobs, which Nomad will automatically schedule so that there’s one instance running per node.
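For reference, a system job is simply one declared with type = "system"; the job name and image below are just illustrative:

job "node-exporter" {
  type = "system"

  group "exporter" {
    task "exporter" {
      driver = "docker"

      config {
        image = "prom/node-exporter"
      }
    }
  }
}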

Thanks for clarifying that a policy should only be used for scaling either the application or the cluster, but I think that makes this feature not fit for purpose in most cases where I would want to use it.

You mention in your reply that you would have one policy that would scale the cluster and a second policy that would scale the application, but as far as I can tell, that is not possible. You cannot have more than one policy in a scaling stanza, and you can only have one scaling stanza per job definition. How would you have one policy to scale the cluster and a second that would scale the application in the same job? I currently have to use HashiCorp Levant to create job definitions, as we use it for templating, and when I try to upload a job with multiple policy blocks I get an error like:

Error getting job struct: Error parsing job file from /tmp/job.nomad: error parsing 'job': group: scaling -> only one 'policy' block allowed per 'scaling' block

So how is what you’ve described possible?

In my situation I have 3 applications which take jobs from Kafka queues (one queue for each application). I have a metric in Prometheus which tells us how old the messages in the queues are. The machines that run these applications are quite large: they have 16GB of RAM, and we have told Nomad that each application needs 4GB of RAM to run. So in theory we can have 4 instances of each application running on each machine, but normally we would just have one copy of each application.

If the queue for one application starts to grow we want to add more copies of the application, but if the server is at capacity, we also want to be able to add more servers to the group to consume more messages from the queue.

It seems that I would only be able to scale the application to have one copy per server, since you set the max/min in the scaling block and this is used for both the number of applications running and the number of servers in the cluster. You cannot have a case where one of the three applications scales to use up the remaining capacity on the first server and then, once that server is at capacity, more servers and more copies of that application are added to the new servers until the queue for that application starts to decrease.

How would you have one policy to scale the cluster and a second that would scale the application in the same job?

You wouldn’t. The cluster scaling policy should be placed in a separate file. For example, you could have a file structure like this:

.
├── nomad-autoscaler
├── jobs
│   ├── job1.nomad
│   ├── job2.nomad
│   └── job3.nomad
└── policies
    └── aws-asg-cluster.hcl

The jobs would have scaling blocks with their own specific app scaling policies, while the policies/aws-asg-cluster.hcl file would have the cluster policy. You can then start the Autoscaler passing that folder:

$ nomad-autoscaler agent -policy-dir=./policies

If you are running the Autoscaler as a Nomad job, you can create this file structure using templates for the cluster policies.
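A rough sketch of that, assuming the Docker driver and the hashicorp/nomad-autoscaler image (the image tag, paths, and the policy contents are placeholders to adapt):

job "autoscaler" {
  group "autoscaler" {
    task "autoscaler" {
      driver = "docker"

      config {
        image   = "hashicorp/nomad-autoscaler:0.3.3"
        command = "nomad-autoscaler"
        args    = ["agent", "-policy-dir", "${NOMAD_TASK_DIR}/policies"]
      }

      # Render the cluster scaling policy into the policy directory.
      template {
        destination = "${NOMAD_TASK_DIR}/policies/aws-asg-cluster.hcl"
        data        = <<EOF
min = 1
max = 5

policy {
  # ... the aws-asg cluster policy goes here ...
}
EOF
      }
    }
  }
}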

In my situation I have 3 applications which take jobs from kafka queues (one queue for each application). I have a metric in prometheus which tells us how old the messages in the queues are.

In each of these jobs, you would add a scaling block that would look like this:

job "job1" {
  group "consumer" {
    ...
    scaling {
      min = 1
      max = 10

      policy {
        check "queue_age" {
          source   = "prometheus"
          query    = "kafka_oldest_message{queue=\"queue_1\"}"

          strategy "target-value" {
              target = 10
          }
        }
      }
    }
  }
}

In this example, kafka_oldest_message would return how many hours the oldest job has been sitting in the queue. If it’s less than 10h, only 1 worker would be running (min = 1). If it’s more than 10h, the Autoscaler would create workers proportionally to the metric, so if the metric is 30 the Autoscaler would set 3 workers for that queue. You will need to adapt the query to your specific scenario, but I hope this helps give you an idea.

For cluster scaling, you would create a new file (not a job file) in a policies folder. The file content will be something like this:

min = 1
max = 5

policy {
  check "mem_allocated_percentage" {
    source = "prometheus"
    query  = "sum(nomad_client_allocated_memory*100/(nomad_client_unallocated_memory+nomad_client_allocated_memory))/count(nomad_client_allocated_memory)"
    strategy "target-value" {
      target = 70
    }
  }

  target "aws-asg" {
    aws_asg_name = "hashistack-nomad_client"
    node_class   = "hashistack"
  }
}

So this is a different file, and it’s not a job file. It will be read by the Autoscaler when it starts if you point it to the policies directory:

$ nomad-autoscaler agent -policy-dir=./policies

It seems that I would only be able to scale the application to have one copy per server, since you set the max/min in the scaling block and this is used for both the number of applications running and the number of servers in the cluster

min and max values are independent for each policy. In the examples above, the number of nodes will range from 1 to 5 and the number of application instances will range from 1 to 10.


Ah that makes sense! I misunderstood the way you were meant to do it, that really helps, thanks!
