I have set up Vault to generate ephemeral MongoDB credentials for an application running on Kubernetes. My current solution works well, but it does not have zero downtime. Once a request to MongoDB fails because the credentials expired, the application makes a request to Vault to generate a new set of credentials. While this works, all requests to the application from the moment the credentials expire to the moment Vault generates the new ones will fail.
I was considering setting a TTL of let’s say 2 weeks and then using consul-template with a grace period of 1 week, so that a new key would be generated well before the existing one expires. Then the application would start using the new key. I believe this approach should have zero downtime.
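To make the timing concrete, here is a small sketch of the arithmetic I have in mind. It only computes dates; it does not call Vault or consul-template, and the function name `rotation_window` and the issue date are made up for illustration.

```python
from datetime import datetime, timedelta


def rotation_window(issued_at, ttl, grace):
    """Return (rotate_at, expires_at) for a credential.

    Hypothetical model: a replacement credential is requested `grace`
    before the current one expires, so both are valid during the overlap.
    """
    expires_at = issued_at + ttl
    rotate_at = expires_at - grace
    return rotate_at, expires_at


# Example values from the post: 2-week TTL, 1-week grace period.
issued = datetime(2020, 1, 1)  # arbitrary issue date, for illustration only
rotate_at, expires_at = rotation_window(
    issued, ttl=timedelta(weeks=2), grace=timedelta(weeks=1)
)
print("request new credentials at:", rotate_at)
print("old credentials expire at: ", expires_at)
```

With these numbers the application would have a full week in which both the old and the new credentials are valid.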
The problem is that the grace period in consul-template (and vault) was removed quite a while ago. See:
I can't think of any solution as simple as using the grace period to avoid downtime. Does consul-template (or maybe vault-agent itself) now provide any way to generate new credentials before the existing ones expire?
The renewing/reauthentication logic automatically triggers at a certain point in the lifetime of the lease and, on failure, keeps scheduling itself to try again. Once it reaches a certain threshold where renewals no longer extend TTL it will attempt to acquire a new lease. Grace period only existed because the old logic was much less capable.
Is it guaranteed that the renewing/reauthentication logic will trigger well before the TTL expires? For instance, a trigger taking place one week after the credentials were obtained (and one week before their expiration) for a 2-week TTL.
Are these details (or the configuration pieces that control them) documented anywhere?
I first thought that a new credential would be created as soon as the current one can no longer be renewed for a full TTL.
For example:
TTL: 2m
MAX TTL: 5m
t-0: create cred1
t+2: extend lease of cred1
t+4: cred1 can’t be extended for 2 more minutes, so extend it by the remaining 1m (up to MAX TTL) and create cred2
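As a rough sketch, the schedule I expected can be written out like this. It is only my mental model, not what Vault or consul-template actually implements, and `expected_schedule` is a made-up name.

```python
def expected_schedule(ttl, max_ttl):
    """Return (minute, action) events for cred1 under the expected model.

    Hypothetical model only: renew every `ttl` minutes while a full `ttl`
    still fits before `max_ttl`; at the first renewal where it no longer
    fits, extend by whatever remains and create the next credential.
    """
    events = [(0, "create cred1")]
    t = ttl
    while t + ttl <= max_ttl:
        events.append((t, "extend lease of cred1"))
        t += ttl
    remaining = max_ttl - t
    events.append(
        (t, f"extend cred1 by {remaining}m (up to MAX TTL) and create cred2")
    )
    return events


for minute, action in expected_schedule(ttl=2, max_ttl=5):
    print(f"t+{minute}: {action}")
```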
But it’s not the case…
I can see in the logs that the extend/renew calls happen at seemingly arbitrary times, without regard to the TTL period. A new credential is also created at an arbitrary moment, not at the first renewal where the TTL can no longer be extended for the full period.
I’m digging into the code, but I’m about to open an issue on that…
Any help welcomed.
(PS: I’m using Vault 1.4.0 with the Postgres database plugin; I have tested with TTLs ranging from 4m to 12h.)