Response wrapping and horizontal scaling

ahlvb · October 10, 2021, 1:23pm

Dear all,

Is there any recommendation for using response-wrapping tokens when the production application is eligible for autoscaling? Autoscaling results in new deployments, but since the wrapping token is a short-lived, one-time token, I cannot see how the additional deployments can succeed. Any best practices?

Thanks !

aram · October 11, 2021, 9:15am

I just want to say up front that I cannot recommend auto-scaling Vault, at least not for performance reasons. You can, and should, use it for rolling upgrades and for recovering from simple instance instability.

That said, the two really have no connection other than that one can trigger the other. You cannot expect a connection to trigger an auto-scale and the new instance to respond to that same request. A request will tip the response-time threshold you defined; a new instance will then be instantiated, start, and join the Raft cluster (or directly become a performance standby). It may or may not trigger an election at that point.

ahlvb · October 12, 2021, 11:58am

Thanks for your answer, but I was actually writing about autoscaling of the application, not of Vault itself. Sorry if that was not clear.

If autoscaling and/or high availability is a requirement, the application is deployed more than once under the same conditions (i.e. with the same wrapping token). Every instance will try to unwrap the wrapping token it receives (POST /v1/sys/wrapping/unwrap), but only one will succeed. Did I miss something?

aram · October 12, 2021, 7:57pm

You can set the wrap limit to whatever you like/need.

ahlvb · October 14, 2021, 5:56pm

With num_uses?

I tried this:

curl -X POST --header "X-Vault-Token: ..." --header "X-Vault-Wrap-TTL: 1m" --data '{"num_uses":2}' http://localhost:8200/v1/auth/token/create  | jq '.wrap_info.token'

and then

curl -X POST --header "X-Vault-Token: s.Q2F7YpqNSSi3JdvFaZjKfCuW"  http://localhost:8200/v1/sys/wrapping/unwrap  | jq

twice (s.Q2F7YpqNSSi3JdvFaZjKfCuW is the result of the first call).

The first call returns the client token; the second returns:

{
  "errors": [
    "wrapping token is not valid or does not exist"
  ]
}

And anyway, it would only work for a fixed number of replicas deployed together.
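
For context on the error above: a response-wrapping token is always single-use, and num_uses in the request body limits the wrapped client token rather than the wrapping token. The sketch below restates the two calls from this post as functions to make that split visible. It assumes Vault is reachable at VAULT_ADDR and that curl and jq are installed; the function names are hypothetical and only for illustration.

```shell
# Assumption: Vault address; defaults to the local dev address used in this thread.
VAULT_ADDR="${VAULT_ADDR:-http://localhost:8200}"

# Create a client token limited to 2 uses and wrap the response for 1 minute.
# $1 = parent token allowed to create tokens.
# Prints the single-use wrapping token.
create_wrapped_token() {
  curl --silent --request POST \
    --header "X-Vault-Token: $1" \
    --header "X-Vault-Wrap-TTL: 1m" \
    --data '{"num_uses":2}' \
    "$VAULT_ADDR/v1/auth/token/create" | jq -r '.wrap_info.token'
}

# Unwrap exactly once. This call consumes the wrapping token, so a second
# unwrap with the same wrapping token is rejected regardless of num_uses.
# $1 = wrapping token. Prints the wrapped client token.
unwrap_once() {
  curl --silent --request POST \
    --header "X-Vault-Token: $1" \
    "$VAULT_ADDR/v1/sys/wrapping/unwrap" | jq -r '.auth.client_token'
}
```

Calling `unwrap_once "$(create_wrapped_token "$PARENT_TOKEN")"` (PARENT_TOKEN being a placeholder) yields one client token; it is that client token, not the wrapping token, that carries the two-use limit.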

jmartinson · October 14, 2021, 6:34pm

It sounds like you’re delivering tokens to applications as part of a deployment pipeline. If your platform supports autoscaling, there may be machine identities available that would allow app instances to authenticate to Vault directly, using a token from the local metadata service or IAM credentials.

Examples:

This allows each app instance to have its own unique token, which provides more granular audit trails, allows you to revoke secrets for individual instances/containers, and allows token lifecycles to more closely match the client lifecycle.
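
As one possible illustration of this approach, here is a minimal sketch that assumes the application runs on Kubernetes and that Vault's Kubernetes auth method is enabled at its default path, auth/kubernetes. The platform choice, the role name, and the function name are assumptions for the example and do not come from this thread; other platforms have their own auth methods.

```shell
# Assumptions: Vault address and the standard in-pod service account token path.
VAULT_ADDR="${VAULT_ADDR:-http://localhost:8200}"
SA_TOKEN_FILE="${SA_TOKEN_FILE:-/var/run/secrets/kubernetes.io/serviceaccount/token}"

# Log in with the pod's own service account JWT and print a Vault client token.
# $1 = Vault role name (hypothetical, e.g. "my-app"); assumed to contain no quotes.
vault_k8s_login() {
  role="$1"
  jwt="$(cat "$SA_TOKEN_FILE")"
  curl --silent --request POST \
    --data "{\"jwt\": \"$jwt\", \"role\": \"$role\"}" \
    "$VAULT_ADDR/v1/auth/kubernetes/login" | jq -r '.auth.client_token'
}
```

Each replica would call something like `vault_k8s_login my-app` at startup, so a newly scaled instance obtains its own token without any wrapping token being delivered by the pipeline.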
