Clarifications on Nomad metrics values

hi!

what is the difference between “system space” and “user space”, specifically metric

The difference is the same as on Linux. See, for example, "User CPU time vs System CPU time?" on Stack Overflow, and research Linux CPU usage metrics.
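To see the user/system split on your own machine, here is a minimal Python sketch using os.times(). It is not Nomad-specific; the helper names (cpu_times_delta, busy_loop, many_syscalls) are made up for illustration, and the exact numbers will vary per machine.

```python
import os


def cpu_times_delta(work):
    """Run work() and return (user_seconds, system_seconds) used by this process."""
    before = os.times()
    work()
    after = os.times()
    return after.user - before.user, after.system - before.system


def busy_loop():
    # Pure computation stays in user space, so it mostly adds user time.
    total = 0
    for i in range(200_000):
        total += i * i
    return total


def many_syscalls():
    # Each os.stat() call enters the kernel, so this mostly adds system time.
    for _ in range(20_000):
        os.stat(".")


if __name__ == "__main__":
    for name, work in (("busy_loop", busy_loop), ("many_syscalls", many_syscalls)):
        user, system = cpu_times_delta(work)
        print(f"{name}: user={user:.3f}s system={system:.3f}s")
```

Time spent running your own code counts as user time; time the kernel spends working on your behalf (system calls, I/O) counts as system time.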

but all other allocs.cpu metrics are in Percentage

What about nomad.client.allocs.cpu.total_ticks?

How can I build an alarm that triggers when allocation CPU usage crosses available allocation CPU?

I use Prometheus and alert when the following expression is greater than 100 (percent):

nomad_client_allocs_cpu_total_ticks{namespace=~"$namespace",instance=~"$client",exported_job=~${job:doublequote},task_group=~"$group",task=~"$task",alloc_id=~"$alloc_id"} * 100
/
nomad_client_allocs_cpu_allocated{namespace=~"$namespace",instance=~"$client",exported_job=~${job:doublequote},task_group=~"$group",task=~"$task",alloc_id=~"$alloc_id"}
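The arithmetic behind that expression, as a small Python sketch. It assumes, as the query does, that total_ticks and allocated are expressed in the same unit, so their ratio is a meaningful percentage. The function names and the guard against a zero allocation are my own additions, not anything Nomad provides.

```python
def alloc_cpu_percent(total_ticks, allocated):
    """Return CPU usage as a percentage of the allocation, or None if nothing is allocated."""
    if allocated <= 0:
        # Avoid dividing by zero when the series reports no allocated CPU.
        return None
    return total_ticks * 100.0 / allocated


def should_alert(total_ticks, allocated, threshold=100.0):
    """True when usage exceeds the threshold percentage of the allocation."""
    percent = alloc_cpu_percent(total_ticks, allocated)
    return percent is not None and percent > threshold


if __name__ == "__main__":
    # Hypothetical sample values, not taken from a real cluster.
    print(alloc_cpu_percent(500, 1000))
    print(should_alert(1200, 1000))
```

A value above 100 means the allocation is using more CPU than it reserved, which can happen when the client has spare capacity.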

Why is it like that? What am I missing?

See Linux CPU usage metrics; this is not specific to Nomad. See man proc, in particular the /proc/stat documentation.
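As a rough illustration of where such numbers come from, here is a Python sketch that parses the aggregate "cpu" line of /proc/stat and reports user and system time as a share of the total. The field order follows man proc; the function names and the sample line are invented for the example, and guest time is left out of the total because man proc documents it as already included in user.

```python
FIELDS = ["user", "nice", "system", "idle", "iowait",
          "irq", "softirq", "steal", "guest", "guest_nice"]


def parse_cpu_line(line):
    """Parse a 'cpu ...' line from /proc/stat into a dict of counters (in clock ticks)."""
    parts = line.split()
    if not parts or not parts[0].startswith("cpu"):
        raise ValueError("not a cpu line: %r" % line)
    # Older kernels report fewer columns; zip simply stops at the shorter list.
    return dict(zip(FIELDS, (int(v) for v in parts[1:])))


def user_system_percent(sample):
    """Return (user %, system %) of all time accounted for in the sample."""
    counted = FIELDS[:8]  # user .. steal; guest time is already part of user
    total = sum(sample.get(name, 0) for name in counted)
    if total == 0:
        return 0.0, 0.0
    return sample["user"] * 100.0 / total, sample["system"] * 100.0 / total


if __name__ == "__main__":
    # Linux only: the first line of /proc/stat is the aggregate over all CPUs.
    with open("/proc/stat") as f:
        print(user_system_percent(parse_cpu_line(f.readline())))
```

These counters are cumulative since boot, so a monitoring tool samples them twice and works with the difference to get usage over an interval.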

it also doesn’t add up really.

What about the kernel? What about I/O device buffers? Consider researching Linux memory.

What metric can I use to see nomad internal client/host processes CPU/memory usage?

I do not understand the question. What are “internal client” and “internal host” processes, and how do they differ from “external” ones? You might be interested in Zabbix, Prometheus, or Nagios.

To monitor the Go runtime “internals” of the Nomad process itself (see the runtime/metrics package on Go Packages), I use nomad_runtime_alloc_bytes and nomad_runtime_heap_objects.