I had a Vault cluster in a bad state the other day. vault status reported that everything was fine, but the requests I made returned 500 errors. Fortunately, the cluster is working again. Unfortunately, because it was a production system, I did not capture any of the errors or do more troubleshooting.
Main question: I am wondering if there is a status endpoint that I could monitor for general 500 errors?
I wonder if /sys/health would be the correct place…
The list of “default status codes” does not include the general 500 code: https://www.vaultproject.io/api/system/health.html
Side question: If there are “default status codes”, does that mean that custom codes can be enabled?
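For context, this is roughly the kind of check I have in mind. It is only a minimal sketch: the code table is my reading of the default status codes on the linked page, the function names are made up for illustration, and treating a standby node (429) as healthy is my own assumption. Anything outside the documented list, such as a plain 500, is reported as unexpected. Since vault status looked fine during the incident, I am not sure this endpoint alone would have caught the problem.

```python
import urllib.error
import urllib.request

# Default codes as I read them from the linked /sys/health documentation.
# Treat this table as an assumption and verify it against your Vault version.
DEFAULT_HEALTH_CODES = {
    200: "initialized, unsealed, active",
    429: "unsealed, standby",
    472: "disaster recovery secondary, active",
    473: "performance standby",
    501: "not initialized",
    503: "sealed",
}


def classify_health_status(code):
    """Return (ok, description) for an HTTP status code from /sys/health."""
    if code in DEFAULT_HEALTH_CODES:
        # Assumption: active and standby nodes both count as healthy.
        ok = code in (200, 429)
        return ok, DEFAULT_HEALTH_CODES[code]
    # Anything not in the documented list (for example 500) is unexpected.
    return False, f"unexpected status {code}"


def check_vault_health(base_url, timeout=5.0):
    """Query /v1/sys/health on base_url and classify the response code."""
    url = base_url.rstrip("/") + "/v1/sys/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            code = resp.status
    except urllib.error.HTTPError as err:
        # urllib raises for non-2xx responses; the status code is still usable.
        code = err.code
    except OSError as err:
        # Covers URLError, connection refused, timeouts, and similar failures.
        return False, f"request failed: {err}"
    return classify_health_status(code)
```

A monitor could call check_vault_health("https://vault.example.com:8200") on a schedule (the hostname is a placeholder) and alert whenever the first element of the result is False.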