Vault HA cluster monitoring with Prometheus

artur.r · May 26, 2023, 8:38am

Hello!

We have an HA Vault cluster with Raft storage, deployed in K8s with Helm. We configured telemetry (with prometheus_retention_time=24h) for Prometheus scraping, but the core.* telemetry metrics seem to be missing on standby nodes.
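
To pin down exactly which series are missing, it may help to diff the metric names each pod returns from /v1/sys/metrics?format=prometheus. Below is a minimal Python sketch, assuming the scrape output has already been fetched as text. The helper name metric_names and both sample payloads are made up for illustration and are not real Vault output:

```python
def metric_names(exposition_text, prefix="vault_core_"):
    """Return the metric names starting with `prefix` found in
    Prometheus text-exposition output."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # The name ends at the first "{" (labels) or the first space (value).
        name = line.split("{", 1)[0].split(" ", 1)[0]
        if name.startswith(prefix):
            names.add(name)
    return names


# Hypothetical payloads standing in for what an active and a standby pod return.
ACTIVE_SAMPLE = """\
# TYPE vault_core_unsealed gauge
vault_core_unsealed{cluster="vault"} 1
vault_core_active 1
vault_runtime_alloc_bytes 1.2e+07
"""

STANDBY_SAMPLE = """\
vault_runtime_alloc_bytes 9.8e+06
"""

missing_on_standby = metric_names(ACTIVE_SAMPLE) - metric_names(STANDBY_SAMPLE)
print(sorted(missing_on_standby))
```

With these made-up samples the difference is the two vault_core_* names; against real pods the result would show which core metrics only the active node exposes.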

I’ve found a similar issue that was opened back in 2020 (!).

Is there a workaround now? How can we monitor Vault cluster health with Prometheus?

maxb · May 26, 2023, 9:25am

Vault’s implementation of Prometheus metrics has quite a few bugs and suboptimal design choices, unfortunately.

Expect to need to set up a separate custom exporter, or work around quite a lot of issues in PromQL.
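
As a rough idea of what such an exporter could look like, here is a minimal Python sketch that probes each node's /v1/sys/health endpoint and turns the HTTP status code into a gauge. The status codes are the endpoint's documented defaults; the metric name vault_node_state, the state labels, and the pod names are assumptions chosen for illustration, not anything Vault itself emits:

```python
import urllib.error
import urllib.request

# Default status codes returned by Vault's /v1/sys/health endpoint.
HEALTH_CODES = {
    200: "active",
    429: "standby",
    472: "dr_secondary",
    473: "performance_standby",
    501: "uninitialized",
    503: "sealed",
}


def classify(status_code):
    """Map a /v1/sys/health status code to a state label (labels are our own)."""
    return HEALTH_CODES.get(status_code, "unknown")


def probe(base_url, timeout=5.0):
    """Return the HTTP status of GET <base_url>/v1/sys/health, or 0 if unreachable."""
    try:
        with urllib.request.urlopen(base_url + "/v1/sys/health", timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # Non-2xx responses (429, 503, ...) still carry the health state.
        return err.code
    except (urllib.error.URLError, OSError):
        return 0


def render(node_codes):
    """Render {node name: status code} as Prometheus text exposition.

    vault_node_state is a hypothetical metric name, not a built-in Vault metric.
    """
    lines = ["# TYPE vault_node_state gauge"]
    for node, code in sorted(node_codes.items()):
        lines.append('vault_node_state{node="%s",state="%s"} 1' % (node, classify(code)))
    return "\n".join(lines) + "\n"


# Example with hard-coded codes; in practice each value would come from probe(url).
print(render({"vault-0": 200, "vault-1": 429, "vault-2": 429}))
```

Because the health endpoint answers on standby nodes as well, this sidesteps the missing core.* series, at the cost of running and scraping one more component.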

I was interested in fixing this in the past, but I no longer work at an employer that uses Vault, and HashiCorp’s rate of accepting community contributions is too slow for it to be enjoyable as a hobby.
