Is there any way to calculate the cost of Nomad resources (per namespace, per job, etc.)?

barrosoguillermo55 · November 8, 2023, 12:03pm

We need to have something that allows us to monitor each namespace and job and estimate their costs based on our criteria.

Is there a way to do this? I know there are projects like OpenCost that may be suitable for this. Maybe we would have to change a bit of its code to make it usable with Nomad.

Thanks in advance

Kamilcuk · November 9, 2023, 7:04am

I think you mean “resource allocation utilization”, and you want to monitor resource utilization over time, split into groups.

By observing the Prometheus metrics from the Nomad clients, you can plot graphs of per-allocation resource usage. Take the nomad_client_allocs_cpu_allocated metric, plot it in Grafana, and you’re done. There should even be examples online. See: Using Prometheus to Monitor Nomad Metrics (HashiCorp Developer), Monitoring Nomad (HashiCorp Developer), and Dashboards (Grafana Labs).

I took one job and drew this for you: [image]
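
If you want numbers per namespace and job rather than a graph, the grouping can be sketched in a few lines of Python. This is only an illustration: it assumes you already have the `result` list from a Prometheus instant query (each entry has a `metric` label map and a `value` pair of timestamp and string value), and it assumes the series carry `namespace` and `exported_job` labels. Those label names depend on your scrape config, so check what your own series look like. The function name is made up for this example.

```python
from collections import defaultdict


def sum_by_namespace_and_job(result, job_label="exported_job"):
    """Sum sample values per (namespace, job) pair.

    `result` is assumed to be the "result" list of a Prometheus
    instant-query response. The label names are assumptions.
    """
    totals = defaultdict(float)
    for sample in result:
        labels = sample["metric"]
        key = (
            labels.get("namespace", "unknown"),
            labels.get(job_label, "unknown"),
        )
        # Prometheus encodes the sample value as a string.
        totals[key] += float(sample["value"][1])
    return dict(totals)


if __name__ == "__main__":
    # Made-up samples, only to show the expected input shape.
    example = [
        {"metric": {"namespace": "default", "exported_job": "web"},
         "value": [1700000000, "500"]},
        {"metric": {"namespace": "default", "exported_job": "web"},
         "value": [1700000000, "250"]},
        {"metric": {"namespace": "batch", "exported_job": "etl"},
         "value": [1700000000, "1000"]},
    ]
    print(sum_by_namespace_and_job(example))
```

The per-group totals can then be multiplied by whatever unit price you settle on.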

benvanstaveren · November 22, 2023, 2:18am

Since it seems you can set the criteria, all you need to do is enable Prometheus metrics, and apply your criteria to the metric in question.

This is pretty much what we do at $work, but we only look at memory, and basically calculate it along the lines of “cost of a physical node divided by GB of memory that node has” - this gets us a cost per GB for that particular node. We keep a rolling “average” of this cost per GB across our various nodes and add a (small) margin on top - this is then used to calculate the cost of a job via said Prometheus metrics. Not 100% accurate (although it could be with some judicious use of node meta info, Prometheus metrics, and a few glue scripts), but it suffices for our needs.
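
A minimal sketch of that arithmetic in Python, to make it concrete. The function names, the node prices, and the 5% default margin are all made up for illustration, and a plain mean stands in for the rolling average. It is not our actual glue code.

```python
from statistics import mean


def cost_per_gb(node_cost, node_memory_gb):
    """Cost of one GB of memory on a single node."""
    return node_cost / node_memory_gb


def blended_rate(nodes, margin=0.05):
    """Average cost per GB across nodes, plus a margin.

    `nodes` is a list of (node_cost, node_memory_gb) tuples.
    The 5% default margin is an arbitrary placeholder.
    """
    rates = [cost_per_gb(cost, mem_gb) for cost, mem_gb in nodes]
    return mean(rates) * (1 + margin)


def job_cost(allocated_memory_gb, rate_per_gb):
    """Cost of a job from its allocated memory, in GB.

    Convert the metric value to GB first; check which unit
    your memory metric reports.
    """
    return allocated_memory_gb * rate_per_gb


if __name__ == "__main__":
    # Hypothetical monthly node prices and memory sizes.
    nodes = [(400.0, 128.0), (300.0, 64.0)]
    rate = blended_rate(nodes)
    print(f"rate per GB: {rate:.4f}")
    print(f"job with 8 GB allocated: {job_cost(8.0, rate):.2f}")
```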

For CPU you could do something similar, since the node info panel shows you the number of compute slices a node provides to the cluster. But then you get into the whole question of whether to split a node's cost 50/50 between CPU and memory or weigh it differently. Hence, we look at one metric (the one that matters most to us, at any rate) to keep things simple.
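
If you did want to go the combined route, a weighted split could look roughly like the sketch below. Everything here is an assumption for illustration: the function names, the use of MHz as the CPU unit, and the example figures. The weight is the knob you would have to pick yourself.

```python
def node_rates(node_cost, total_cpu_mhz, total_memory_gb, cpu_weight=0.5):
    """Split a node's cost between CPU and memory by `cpu_weight`.

    Returns (cost per MHz, cost per GB). A weight of 0.5 is the
    50/50 split; any other value in [0, 1] weighs it differently.
    """
    if not 0.0 <= cpu_weight <= 1.0:
        raise ValueError("cpu_weight must be between 0 and 1")
    cpu_rate = node_cost * cpu_weight / total_cpu_mhz
    memory_rate = node_cost * (1.0 - cpu_weight) / total_memory_gb
    return cpu_rate, memory_rate


def job_cost(cpu_mhz, memory_gb, cpu_rate, memory_rate):
    """Cost of a job from its allocated CPU and memory."""
    return cpu_mhz * cpu_rate + memory_gb * memory_rate


if __name__ == "__main__":
    # Hypothetical node: costs 400, provides 20000 MHz and 128 GB.
    cpu_rate, memory_rate = node_rates(400.0, 20000.0, 128.0)
    print(job_cost(1000.0, 8.0, cpu_rate, memory_rate))
```

With `cpu_weight=0.0` this collapses back to the memory-only calculation.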

mrchrisadams · December 6, 2023, 10:41am

Hi folks, I’d be curious about this too, as there’s now an issue in the opencost repo called “OpenCost without Kubernetes”:

They also have a spec, which you could presumably use to compare how Nomad sees things with how k8s sees things, and to design some kind of meaningful mapping between the two:

More below
