So the past few days I’ve been noticing some awfully odd behaviour with Nomad: we’ve had entire clients crashing left, right, and center, and after a bit of poking it seems Nomad is apparently very content to just allocate all the things. For instance, we have a node with 128 GB of memory, and if I search for that node in the topology view, it shows this: 175.01 GiB / 125.7 GiB.

Also, if we look at the allocations, that node is running 26 allocations (26 copies of the same job, with 3 tasks per job) at a total of 7136 MB of memory per allocation. Now, 26 × 7136 MB comes to roughly 181 GiB, which should not be happening: Nomad should not have placed this many allocations on this node, because they physically won’t fit. Granted, the jobs don’t actually use the entire 7136 MB, so we’ve been lucky so far that none of them have, but we’re starting to see more and more OOM errors.
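For context, the 7136 MB per allocation is just the sum of the three tasks’ memory reservations in the job spec. The task names, images, and the per-task split below are made up, but the shape of the job is roughly this:

```hcl
job "example" {
  group "app" {
    # Three tasks whose memory reservations add up to 7136 MB per allocation.
    # Names, images, and the per-task split are illustrative, not our real spec.
    task "api" {
      driver = "docker"
      config {
        image = "example/api:latest"
      }
      resources {
        memory = 4096 # MB
      }
    }

    task "worker" {
      driver = "docker"
      config {
        image = "example/worker:latest"
      }
      resources {
        memory = 2048 # MB
      }
    }

    task "sidecar" {
      driver = "docker"
      config {
        image = "example/sidecar:latest"
      }
      resources {
        memory = 992 # MB → 4096 + 2048 + 992 = 7136 MB reserved per allocation
      }
    }
  }
}
```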
The Nomad server is currently at 1.6.3 (yes, outdated, I know, but we can’t update due to… reasons); the clients are at 1.6.5 (same here, same reasons).
Output from the scheduler config:
nomad operator scheduler get-config -region=xxxxxx
Scheduler Algorithm = spread
Memory Oversubscription = false
Reject Job Registration = false
Pause Eval Broker = false
Preemption System Scheduler = true
Preemption Service Scheduler = false
Preemption Batch Scheduler = false
Preemption SysBatch Scheduler = false
Modify Index = 13117658
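For reference, this is roughly how I’ve been cross-checking what the scheduler thinks it has handed out versus what the allocations actually use (IDs redacted, substitute your own):

```shell
# What the node reports: its total resources and what the scheduler has allocated on it
nomad node status -verbose <node-id>

# Per-allocation view: reserved memory vs. actual usage for one of the 26 allocations
nomad alloc status -stats <alloc-id>
```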
So I guess the big question is: why is Nomad ignoring the actual physical memory size and allocating more than would actually fit on the machine?
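For what it’s worth, as far as I understand it the scheduler works from the memory the client fingerprints, minus anything reserved, and both of those can be overridden in the client config. The sketch below (with made-up values) is just to show the settings I mean; if `memory_total_mb` were set higher than the physical RAM, I’d expect exactly this kind of overcommit:

```hcl
client {
  # Overrides the fingerprinted total memory (in MB) when set.
  memory_total_mb = 131072

  reserved {
    # Memory (in MB) held back from scheduling for the OS and Nomad itself.
    memory = 1024
  }
}
```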