Response wrapping and horizontal scaling

Dear all,

Is there a recommended way to use response-wrapping tokens when the production application is eligible for autoscaling? Autoscaling results in new deployments, but since the response-wrapping token is a short-lived, one-time token, I cannot see how the additional deployments can succeed. Any best practices?

Thanks!

I just want to say up front that I cannot recommend auto-scaling Vault, at least not for performance reasons. You can, and should, use it for rolling upgrades and to recover from simple instance instability.

That said, the two really have no connection other than that one can trigger the other. You cannot expect that a request will trigger an auto-scale and that the new instance will then respond to that same request. A request will tip the response-time threshold you defined; a new instance will then be instantiated, start, and join the Raft cluster (or directly become a performance standby). It may or may not trigger an election at that point.

Thanks for your answer, but I was actually writing about autoscaling of the application, not Vault itself. Sorry if that was not clear.

If autoscaling and/or high availability is a requirement, the application is deployed more than once under the same conditions (i.e. with the same wrapping token). Every instance will try to unwrap the wrapping token it receives (POST /v1/sys/wrapping/unwrap), but only one will succeed. Did I miss something?

You can set the wrap limit to whatever you like/need.

With num_uses?

I tried this:

curl -X POST --header "X-Vault-Token: ..." --header "X-Vault-Wrap-TTL: 1m" --data '{"num_uses":2}' http://localhost:8200/v1/auth/token/create  | jq '.wrap_info.token'

and then

curl -X POST --header "X-Vault-Token: s.Q2F7YpqNSSi3JdvFaZjKfCuW"  http://localhost:8200/v1/sys/wrapping/unwrap  | jq

twice (s.Q2F7YpqNSSi3JdvFaZjKfCuW is the result of the first call).

The first call returns the client token; the second returns:

  "errors": [
    "wrapping token is not valid or does not exist"

And in any case, it would only work for a fixed number of replicas deployed together.

It sounds like you’re delivering tokens to applications as part of a deployment pipeline. If your platform supports autoscaling, there may be machine identities available that would allow app instances to authenticate to Vault directly, using a token from a local metadata service or IAM credentials.


This allows each app instance to have its own unique token, which provides more granular audit trails, allows you to revoke secrets for individual instances/containers, and allows token lifecycles to more closely match the client lifecycle.
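
As a rough sketch of what that could look like at instance startup, here is a login against Vault's Kubernetes auth method. This is only an illustration under assumptions: I don't know your platform, so Kubernetes is a guess, as is the auth method being mounted at auth/kubernetes. The role name my-app and the helper functions are hypothetical. Other platforms have equivalent auth methods (AWS, GCP, Azure) that follow the same pattern.

```shell
#!/bin/sh
# Sketch only. Assumptions: the Kubernetes auth method is enabled at
# auth/kubernetes, and a role (here the hypothetical "my-app") is bound
# to the pod's service account.

VAULT_ADDR="${VAULT_ADDR:-http://localhost:8200}"
# Standard location of the service account JWT inside a pod.
SA_TOKEN_FILE="/var/run/secrets/kubernetes.io/serviceaccount/token"

# Build the JSON login body from a role name and a service account JWT.
# (JWTs are base64url segments joined by dots, so no JSON escaping is needed.)
build_login_payload() {
  printf '{"role":"%s","jwt":"%s"}' "$1" "$2"
}

# Log in to Vault and print the client token issued to this instance.
vault_login() {
  role="$1"
  jwt="$(cat "$SA_TOKEN_FILE")"
  curl --silent --fail -X POST \
    --data "$(build_login_payload "$role" "$jwt")" \
    "$VAULT_ADDR/v1/auth/kubernetes/login" | jq -r '.auth.client_token'
}

# Usage, run by every instance when it starts:
#   VAULT_TOKEN="$(vault_login my-app)"
```

Because every replica logs in by itself, nothing has to be pre-generated per deployment, and it does not matter how many instances the autoscaler starts.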
