We’re running managed Vault on HashiCorp Cloud on a Standard Small cluster. We’ve recently started seeing errors returned to our client: HTTP 412 “required index state not present”. These errors appear to be highly clustered: we see none for hours or days, then thousands within a 5-minute period.
Some Googling has told us that this is related to Vault’s eventual consistency model – if a request hits a reader node but the Vault state hasn’t been replicated from the writer to the reader, this error is returned.
Two points of confusion with this for us:
My understanding is that this consistency check only occurs when the X-Vault-Index header is passed alongside a read request that depends on some state being present. Its value is taken from the X-Vault-Index header returned by a previous Vault write request, and the 412 indicates that that state version isn’t present yet on the node handling the read. We use node-vault, and from my reading of the code, there is no automatic propagation of that X-Vault-Index header from a response to a subsequent request, so I’m not sure where that value would be coming from.
We aren’t currently using Vault as a secret store; we’re using the Transit engine as an encryption provider. We’re not regularly mutating Vault state; we’re simply asking Vault to encrypt some plaintext using a preexisting key, or to decrypt some ciphertext using that same key.
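To make the first point concrete, here is a minimal sketch of what explicit X-Vault-Index propagation would look like if a client did it by hand. `IndexTracker` and `HeaderMap` are hypothetical names of my own, not part of node-vault or any Vault SDK; the only thing taken from Vault is the header name. As far as I can tell, our client does nothing like this.

```typescript
// Hypothetical helper, not part of node-vault or any Vault SDK.
// It remembers the X-Vault-Index value from a write response and
// attaches it to later requests.
type HeaderMap = Record<string, string>;

const INDEX_HEADER = "x-vault-index";

class IndexTracker {
  private lastIndex: string | undefined;

  // Record the index state from a response, if the header is present.
  // Header names are compared case-insensitively.
  observe(responseHeaders: HeaderMap): void {
    for (const name of Object.keys(responseHeaders)) {
      if (name.toLowerCase() === INDEX_HEADER) {
        this.lastIndex = responseHeaders[name];
      }
    }
  }

  // Return a copy of the request headers with the last observed
  // index state attached. If nothing has been observed yet, the
  // headers are returned unchanged (as a copy).
  apply(requestHeaders: HeaderMap = {}): HeaderMap {
    if (this.lastIndex === undefined) {
      return { ...requestHeaders };
    }
    return { ...requestHeaders, "X-Vault-Index": this.lastIndex };
  }
}
```

With something like this, a read issued right after a write would carry the write's index state, which is the situation in which I would expect a 412 from a node that hasn't caught up.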
I suspect that this might have something to do with client auth (we’re using AppRole), since these client errors seem to be highly clustered together, and clustered around specific client hosts. Our theory is that when a client token expires, we refresh that token, and that introduces inconsistency with our Vault cluster.
I’m wondering what the best way to resolve this is. Should we block all other Vault operations client-side until we can validate that the client token has been properly refreshed?
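For reference, this is roughly what I have in mind, as a sketch only. `TokenGate`, `withRetryOn412`, `backoffDelayMs`, and the `login` callback are all hypothetical names; `login` stands in for whatever performs our AppRole login and returns the new client token. The retry limits and delays are arbitrary, and I don't know whether this addresses the root cause or just hides it.

```typescript
// Hypothetical sketch: gate requests on an in-flight token refresh,
// and retry requests that come back with HTTP 412.
interface VaultResult<T> {
  status: number;
  body: T;
}

class TokenGate {
  private pending: Promise<string> | null = null;
  private token: string | null = null;

  // `login` is an assumed callback that performs the AppRole login.
  constructor(private readonly login: () => Promise<string>) {}

  // Start a refresh. Concurrent callers share one in-flight login.
  refresh(): Promise<string> {
    if (this.pending === null) {
      this.pending = this.login().then(
        (token) => {
          this.token = token;
          this.pending = null;
          return token;
        },
        (err) => {
          this.pending = null;
          throw err;
        },
      );
    }
    return this.pending;
  }

  // Resolve to a usable token, waiting on any in-flight refresh.
  async getToken(): Promise<string> {
    if (this.pending !== null) return this.pending;
    if (this.token === null) return this.refresh();
    return this.token;
  }
}

// Exponential backoff with a cap; attempt is zero-based.
function backoffDelayMs(attempt: number, baseMs = 50, capMs = 2000): number {
  return Math.min(capMs, baseMs * Math.pow(2, attempt));
}

const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Run `op` up to maxAttempts times, retrying only on HTTP 412.
// The last result is returned even if it is still a 412.
async function withRetryOn412<T>(
  op: () => Promise<VaultResult<T>>,
  maxAttempts = 4,
): Promise<VaultResult<T>> {
  let result = await op();
  for (let attempt = 0; result.status === 412 && attempt < maxAttempts - 1; attempt++) {
    await sleep(backoffDelayMs(attempt));
    result = await op();
  }
  return result;
}
```

Each encrypt or decrypt call would first `await gate.getToken()` and then run inside `withRetryOn412`, so that requests issued during a refresh wait for the new token, and a 412 from a node that hasn't caught up is retried after a short delay instead of surfacing to the caller.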