Using one Nomad server (4 GB memory) and two Nomad clients (2 GB memory each) in a Vagrant VirtualBox cluster (ubuntu/focal64 boxes), I’m experimenting with running a batch job of 50,000 Alpine echo tasks. The cluster makes steady progress until roughly 4,000 allocations have completed. After that, progress grinds to a halt, with each Nomad client process repeatedly being killed by the oom_reaper and restarting.
I was hopeful that pull request #9093 would help here, since the problem it was reported to fix sounded very similar. However, v0.12.6 should already include that PR, and it did not help with this issue.
Is there some additional management that has to be done to complete a large batch like this, or does this look like a bug or limitation in Nomad?
The oom_reaper logs end like this:
Oct 22 01:26:14 client-two kernel: oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/system.slice/nomad-client.service,task=nomad,pid=214571,uid=0
Oct 22 01:26:14 client-two kernel: Out of memory: Killed process 214571 (nomad) total-vm:1894584kB, anon-rss:499036kB, file-rss:0kB, shmem-rss:0kB, UID:0 pgtables:1372kB oom_score_adj:0
Oct 22 01:26:14 client-two kernel: oom_reaper: reaped process 214571 (nomad), now anon-rss:48kB, file-rss:0kB, shmem-rss:0kB
Oct 22 01:26:14 client-two sh[214570]: Killed
Oct 22 01:26:14 client-two systemd[1]: nomad-client.service: Main process exited, code=exited, status=137/n/a
Oct 22 01:26:14 client-two systemd[1]: nomad-client.service: Failed with result 'exit-code'.
This is the job file:
# alpineBatch.hcl
job "alpineBatch-50K" {
  datacenters = ["dc1"]
  type        = "batch"

  group "alpines" {
    count = 50000

    volume "data" {
      type      = "host"
      read_only = false
      source    = "alpine"
    }

    task "alpineExample" {
      driver = "docker"

      config {
        image   = "alpine:3"
        command = "sh"
        args    = [ "-c", "echo `adjtimex | awk '/(time.tv_sec|time.tv_usec)/ { printf(\"%06d\", $2) }'` ${node.unique.name} ${env["NOMAD_ALLOC_INDEX"]} >> /data/alpineExample.log" ]
      }

      resources {
        memory = 64
      }

      volume_mount {
        volume      = "data"
        destination = "/data"
        read_only   = false
      }
    }
  }
}
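One workaround I'm considering, though I have not verified it at this scale, is to split the work into smaller chunks with a parameterized job, so the servers and clients only track a few thousand allocations at a time and garbage collection can keep up between dispatches. A rough sketch of what I mean (the job name, chunk size, and `chunk` meta key are hypothetical):

```hcl
# alpineBatchChunk.hcl -- hypothetical chunked variant, untested at scale.
job "alpineBatch-chunk" {
  datacenters = ["dc1"]
  type        = "batch"

  # Each dispatch of this job runs one chunk of the overall batch.
  parameterized {
    meta_required = ["chunk"]
  }

  group "alpines" {
    count = 1000 # chunk size; 50 dispatches would cover the full 50,000

    task "alpineExample" {
      driver = "docker"
      config {
        image   = "alpine:3"
        command = "sh"
        args    = ["-c", "echo chunk=${NOMAD_META_chunk} alloc=${NOMAD_ALLOC_INDEX}"]
      }
      resources {
        memory = 64
      }
    }
  }
}
```

Registered with `nomad job run alpineBatchChunk.hcl`, the chunks would then be submitted one at a time with something like `for i in $(seq 0 49); do nomad job dispatch -meta chunk=$i alpineBatch-chunk; done` -- but I'd rather not have to manage the batch externally if Nomad can handle it directly.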