Nomad not rescheduling allocations due to high usage on one node

Hi everyone. I have three Nomad clients+servers, running Nomad 0.12.3 and 0.11.2.
Everything is running fine, but Nomad does not seem to balance the allocations across the clients: two clients sit at about 70% RAM usage while the third always sits at 23%. The jobs are not all system jobs or bound to a particular client; they could be shifted freely.
I have now added another job of the system type, and Nomad reports that one allocation cannot be placed due to:

  • Resources exhausted on 1 node
  • Dimension memory exhausted on 1 node

The node in question is one of the clients at 70% RAM usage. First of all, I don’t understand what exactly is meant by “resources exhausted” and “dimension memory exhausted”. Which resource specifically? RAM? Disk space? It says I have 50GB free and my job needs about 300MB, which should not cause problems.

The other thing is, why does Nomad not reschedule existing jobs so the job can be started on this client? There are plenty of possibilities.
This is the reschedule policy on all jobs:

```json
"ReschedulePolicy": {
  "Attempts": 0,
  "Interval": 0,
  "Delay": 30000000000,
  "DelayFunction": "exponential",
  "MaxDelay": 3600000000000,
  "Unlimited": true
}
```
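
For readability, the same policy expressed as a job-spec reschedule stanza; the JSON API expresses durations in nanoseconds, so 30000000000 ns is 30s and 3600000000000 ns is 1h:

```hcl
reschedule {
  # With unlimited = true, attempts and interval are not enforced.
  delay          = "30s" # 30000000000 ns
  delay_function = "exponential"
  max_delay      = "1h"  # 3600000000000 ns
  unlimited      = true
}
```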

Hi @sesfre and thanks for the questions, I’ll try and answer each one of them below.

Nomad does not seem to balance the allocations across the clients

The Nomad service scheduler uses a binpacking algorithm by default, which explains the resource differences you are seeing between the clients. You can change this behaviour to the spread algorithm, as documented in the scheduler config section of the agent configuration.
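As an illustration, a sketch of what that server agent configuration could look like; the block and attribute names are taken from the scheduler configuration documentation, so please check that your Nomad version supports them:

```hcl
server {
  enabled = true

  # Seeds the scheduler configuration on first cluster bootstrap;
  # it can also be changed at runtime via the operator scheduler
  # configuration API.
  default_scheduler_config {
    scheduler_algorithm = "spread" # default is "binpack"
  }
}
```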

exactly is meant by “resources exhausted” and “dimension memory exhausted”, which resource specifically

The memory exhaustion here refers to the resources defined in the job specification’s task resources stanza. Nomad fingerprints clients to discover their available resources, most importantly CPU and memory, and keeps track of how much of those has been allocated to workloads. This allocated resource value differs from the actual resource usage of the underlying host.

To put this into an example:

  • The Nomad cluster contains a single client that has 100 MHz CPU and 100 MB memory available
  • A user runs a job that requests 60 MHz CPU and 70 MB memory
  • The Nomad client now has 40 MHz CPU and 30 MB memory available for scheduling of new jobs
  • If the user then tries to run another job requesting 40 MHz CPU and 40 MB memory, placement fails with the same error you have seen, because only 30 MB of memory remains unallocated

As I said previously, it’s important to note that the allocation of resources is separate from the actual resource usage on the underlying host.
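To connect this back to the job specification, here is a hypothetical task with the resources stanza the scheduler uses for placement; the task name, image, and values are examples only:

```hcl
task "web" {
  driver = "docker"

  config {
    image = "nginx:alpine" # example image only
  }

  resources {
    cpu    = 300 # MHz reserved against the client's fingerprinted CPU
    memory = 300 # MB reserved, counted whether or not the task uses it
  }
}
```

It is these reserved values, not the task’s live usage, that are summed per client and compared against the fingerprinted capacity when Nomad evaluates a placement.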

why does Nomad not reschedule existing jobs

If I understand this correctly, I believe this relates to preemption rather than the rescheduling functionality: rescheduling handles allocations that have failed, whereas evicting lower-priority running allocations to make room for new work is preemption.
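For reference, preemption is toggled per scheduler type within the scheduler configuration; a sketch, assuming a Nomad version and edition where these toggles are available:

```hcl
server {
  enabled = true

  default_scheduler_config {
    preemption_config {
      # System scheduler preemption has been enabled by default
      # since Nomad 0.9; preemption for the other scheduler types
      # may require a specific version or edition.
      system_scheduler_enabled  = true
      service_scheduler_enabled = true
      batch_scheduler_enabled   = false
    }
  }
}
```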

I hope this helps. Please let me know if you have any follow-up questions.

jrasell and the Nomad team

Thanks, that makes it much clearer now. I have found the metrics to check the availability in the future.