Prometheus node_exporter filesystem metrics and Nomad CSI mounts


TLDR: I would like to measure disk/filesystem usage of my Nomad CSI volumes, but it doesn’t work with Prometheus’ node_exporter due to very tight permissions on CSI mount points.

I’m currently running a Nomad cluster using Nomad Volumes via ceph-csi. Most of these are RBD volumes, i.e. block devices, so the filesystem usage inside them can only be determined while they are mounted.

To keep an eye on the remaining free space on my volumes, I wanted to just use Prometheus’ node_exporter, which I already had running on all of the Nomad cluster’s nodes.

While the resulting metrics did contain filesystem collector values for e.g. the node’s root filesystem, there was no data for any of the CSI volumes. The reason seems to be the combination of the node_exporter user’s permissions (it is just a normal system user) and the permissions on the Nomad data directory into which the CSI volumes are mounted (only accessible to root, with every directory level down to the actual mount points having 700 permissions).
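
To check this outside of node_exporter, a minimal Python sketch like the following could be run once as the exporter's user and once as root. It assumes Python 3 is available on the node; `probe_statfs` is a hypothetical helper name, and `os.statvfs` is used as a stand-in for the statfs() call the collector makes, so the error text may not match the exporter's log exactly.

```python
import os


def probe_statfs(path):
    """Ask the kernel for filesystem stats on path and report success or the error."""
    try:
        st = os.statvfs(path)
    except OSError as exc:
        # A missing search (x) bit on any parent directory surfaces here
        # as "Permission denied" for an unprivileged user.
        return {"path": path, "ok": False, "error": exc.strerror}
    return {
        "path": path,
        "ok": True,
        "size_bytes": st.f_frsize * st.f_blocks,   # total size of the filesystem
        "avail_bytes": st.f_frsize * st.f_bavail,  # space available to unprivileged users
    }
```

Calling `probe_statfs()` with one of the CSI mount points from the log below should return `ok: False` for the unprivileged user and real numbers for root, if the directory permissions are indeed the cause.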

On one of my nodes, I’m seeing node_exporter errors like this:

Jun 19 19:28:27 experinode prometheus-node-exporter[37383]: level=debug ts=2022-06-19T17:28:27.764Z caller=filesystem_linux.go:95 collector=filesystem msg="Error on statfs() system call" rootfs=/srv/nomad/data/client/csi/node/ceph-csi-rbd/staging/vol-postgres-db/rw-file-system-single-node-writer/0001-0024-a84c7196-7ebf-11eb-b290-18c04d00217f-0000000000000002-c3cd1dc3-8084-11ec-8bf6-0242ac110004 err="permission denied"

The permissions on the corresponding per-alloc directory look like this:

ls -l /srv/nomad/data/client/csi/node/ceph-csi-rbd/per-alloc/62a196ea-8cb3-fd51-618e-55d9072beaf9/vol-postgres-db
total 4
drwx------ 19 88 88 4096 May  2 17:12 rw-file-system-single-node-writer

(88 here is the user for the Postgres container)

Two possible solutions:

  • Run the node_exporter as root
  • Somehow configure all Nomad CSI mounts to have 755 permissions

Neither of these solutions seems like a good idea, particularly because my only goal is to get at the free/used space on my CSI volumes.
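
A narrower variant of the first option might be to leave node_exporter unprivileged and have only a very small script run as root (e.g. from cron), writing the numbers into a file for node_exporter's textfile collector, which reads `*.prom` files from the directory given by `--collector.textfile.directory`. The following is an untested sketch under several assumptions: the metric names `csi_volume_size_bytes` and `csi_volume_avail_bytes` and all function names are made up for illustration, and the mount prefix is simply taken from the log line above.

```python
import os

# Assumption: prefix taken from the log line above; adjust to the local Nomad data_dir.
CSI_PREFIX = "/srv/nomad/data/client/csi/"


def csi_mount_points(mounts_text, prefix=CSI_PREFIX):
    """Return the mount points below prefix from /proc/mounts-style text."""
    points = set()
    for line in mounts_text.splitlines():
        fields = line.split()
        # Field 2 is the mount point (special characters are octal-escaped there).
        if len(fields) >= 2 and fields[1].startswith(prefix):
            points.add(fields[1])
    return sorted(points)


def render_metrics(mount_points, statvfs=os.statvfs):
    """Render size/available bytes per mount point in Prometheus text format."""
    lines = []
    for mp in mount_points:
        try:
            st = statvfs(mp)
        except OSError:
            continue  # volume unmounted in the meantime, or not readable
        label = mp.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(
            f'csi_volume_size_bytes{{mountpoint="{label}"}} {st.f_frsize * st.f_blocks}'
        )
        lines.append(
            f'csi_volume_avail_bytes{{mountpoint="{label}"}} {st.f_frsize * st.f_bavail}'
        )
    return "\n".join(lines) + "\n"


def write_atomically(text, target):
    """Write to a temp file and rename, so the collector never reads a partial file."""
    tmp = target + ".tmp"
    with open(tmp, "w") as fh:
        fh.write(text)
    os.replace(tmp, target)
```

Run as root, the pieces would be chained roughly as: read `/proc/mounts`, pass its contents to `csi_mount_points()`, pass the result to `render_metrics()`, and hand the output to `write_atomically()` with a target file inside the textfile directory. This still needs root, but only for a few lines of code that do nothing except call statvfs.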

My question: Is there any way to give a particular user extended “df”/“stat” permissions, without any other permissions?
How are you all keeping an eye on free space on your Nomad CSI volumes, especially when they are provided by Ceph?