Has anyone experienced an issue with Vault Helm chart version
0.9.1 and Vault version
1.6.2 via the image
vault:1.6.2 that matches the following description?
What seems to be happening is that at some point, the cluster master will encounter some sort of error, so it fails over to another Pod, and that Pod is immediately broken and shows a stream of errors like this:
error during forwarded RPC request: error="rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing remote error: tls: internal error""
TLS handshake error from 10.13.24.148:41014: remote error: tls: bad certificate
This is completely random and we are baffled. It has been happening since April and we have avoided rolling a newer version out to production because this would be terrible were it to happen there. I’m happy to provide more logs or more information, but the TLDR is that the agent injector Pods are unable to reach Vault and our apps are broken since we use the init feature to source secrets.
From there, Vault Pods themselves just ceaselessly complain about TLS until the broken Pod is deleted and replaced, and everything is fine again for between 1 week and 3 months.
This is running in GKE if that helps.