I noticed some random errors on some applications hitting vault in the form of {"errors":["local node not active but active cluster node not found"]}
.
I managed to reproduced it locally by running
▶ curl --request POST --data '{"jwt": "<my_jwt_token>", "role": "<my_role>"}' https://my_vault_instance:8813/v1/auth/kubernetes/login
repeatedly.
Out of say 4000 requests I had 4 errors as the above. Although it seems rare, in a production environment environment with multiple entities making requests at vault this can be an issue.
I have a 1 instance vault deployed on GCE using the official module. I have also HA enabled (based on CPU usage). At no point in time is the machine cpu-stressed more than 50%.
What can be causing this?