Hello, I have an allocation that keeps terminating on my Nomad cluster with the following logs:
Jan 22 09:01:17 ip-192-xx-xx-115 nomad[406]: 2024-01-22T09:01:17.202Z [INFO] client.alloc_runner.task_runner: Task event: alloc_id=50ab3dd9-d2a6-e96d-be53-4a5ed7be6c4a task=frontend type=Terminated msg="Exit Code: 137, Signal: 9" failed=false
By checking the underlying kernel logs, I figured out that this comes from an OOM kill (exit code 137 = 128 + signal 9, i.e. SIGKILL):
Jan 22 09:01:17 ip-192-xx-xx-115 kernel: Memory cgroup out of memory: Killed process 2577 (java) total-vm:8325660kB, anon-rss:4178744kB, file-rss:21016kB, shmem-rss:0kB, UID:65534 pgtables:9180kB oom_score_adj:0
However, I don't understand why I'm not able to see the metric nomad.client.allocs.oom_killed on the metrics endpoint.
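Is something like the following needed in the client agent configuration for that metric to show up? This is just a sketch of what I think the telemetry stanza should look like, not my actual config:

```hcl
# Sketch only (my assumption of the relevant settings, not my real agent config)
telemetry {
  publish_allocation_metrics = true   # per-allocation metrics such as oom_killed
  publish_node_metrics       = true
  prometheus_metrics         = true   # expose metrics at /v1/metrics?format=prometheus
}
```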
Can someone help me understand the whole process?
And how can I prevent such a situation?
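My current idea is to raise the task's memory limit and cap the JVM heap, roughly along these lines. The values are placeholders, and I believe memory_max only works if memory oversubscription is enabled on the cluster, so please correct me if this is the wrong approach:

```hcl
# Sketch only: numbers are placeholders, not my actual job spec
task "frontend" {
  driver = "docker"

  env {
    # cap the JVM heap relative to the cgroup limit (my assumption that this helps,
    # since the OOM-killed process is java)
    JAVA_TOOL_OPTIONS = "-XX:MaxRAMPercentage=75.0"
  }

  resources {
    memory     = 4096   # MB, normal reservation
    memory_max = 6144   # MB, hard limit when memory oversubscription is enabled
  }
}
```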