How to securely authenticate Vault Agent for long-running apps without a static secret_id?

Hi everyone,

I am configuring Vault Agent for a long-running application and hitting a roadblock with the “Secret Zero” problem when using the AppRole auth method.

I want Vault Agent to automatically fetch brand-new tokens over a long period of time (handling Max TTL expirations) using pure, native Vault Agent automation. I strictly want to avoid writing custom wrapper scripts to manage the lifecycle.

To allow the Vault Agent to continuously re-authenticate and get new tokens when the old ones hit their Max TTL, it seems I have to keep the secret_id permanently stored in the vault.hcl file. However, leaving the secret_id permanently on disk essentially turns it into a static password, which defeats the purpose of using Vault for dynamic credentials.

What is the recommended, Vault-native best practice for long-running machines? Should I completely abandon AppRole for this use case and move to platform-based authentication (like Kubernetes Auth for pods or AWS Auth for EC2), or is there a secure, native way to use AppRole continuously without leaving a static secret_id file on the disk?

Thanks in advance for the guidance!

Where does the Vault Agent run? A platform-based auth method such as Kubernetes, AWS, or Azure could be a better option.

Thanks for the suggestion!

The Vault Agent is running on a long-lived VM (non-Kubernetes environment).

In our case, this is a banking / highly restricted (air-gapped) setup, so we don’t have access to platform-based identity providers like AWS, Azure, or Kubernetes.

That’s why I’m exploring AppRole specifically, but trying to understand how to handle re-authentication securely without persisting a static secret_id.

Would love to know if there are recommended patterns for such environments.

All of these platform auth methods rely on an implicit, secure out-of-band delivery: identity tokens get delivered, refreshed, or queried by an “oversight system”. That is what you need here as well.
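To make that concrete for AppRole, here is a minimal sketch of an agent config where the secret_id is delivered out of band and not kept on disk. The file paths and the role name my-app are placeholders I made up, and it assumes some trusted system outside the agent writes a response-wrapped secret_id to that path whenever a fresh one is needed. How the agent behaves when it has to re-authenticate and no new file has arrived is worth checking against the docs for your Vault version.

```hcl
auto_auth {
  method "approle" {
    mount_path = "auth/approle"

    config = {
      # Hypothetical paths; role_id is not secret on its own.
      role_id_file_path   = "/etc/vault-agent/role_id"

      # Written by the out-of-band delivery system, ideally on tmpfs.
      secret_id_file_path = "/run/vault-agent/secret_id"

      # Delete the file once the agent has read it (this is the default).
      remove_secret_id_file_after_reading = true

      # Treat the file content as a response-wrapping token and check
      # that it was created by this path ("my-app" is a placeholder role).
      secret_id_response_wrapping_path = "auth/approle/role/my-app/secret-id"
    }
  }

  sink "file" {
    config = {
      path = "/run/vault-agent/token"
    }
  }
}
```

This does not remove the need for a delivery system. It only moves the long-lived trust from a file on the VM to whatever is allowed to generate and hand over the wrapped secret_id.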

This can take the form of TPM (vTPM) infrastructure or bootstrap certificates (installed and registered before anything else gets access to the machine) that can then be used to fetch a new token automatically.
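For the bootstrap-certificate route, a rough sketch of what the agent side could look like with the cert auto-auth method. It assumes the TLS certificate auth method is enabled at auth/cert with a certificate role called my-app; the role name, the Vault address, and all file paths are placeholders, not anything from your setup.

```hcl
vault {
  # Placeholder address for the Vault server.
  address = "https://vault.example.internal:8200"
}

auto_auth {
  method "cert" {
    mount_path = "auth/cert"

    config = {
      # Certificate role to authenticate against (placeholder name).
      name        = "my-app"

      # Bootstrap client certificate and key installed on the machine
      # before anything else gets access to it (placeholder paths).
      client_cert = "/etc/vault-agent/tls/client.pem"
      client_key  = "/etc/vault-agent/tls/client-key.pem"
    }
  }

  sink "file" {
    config = {
      path = "/run/vault-agent/token"
    }
  }
}
```

With this, re-authentication after a token hits its Max TTL reuses the certificate, so the thing you rotate and protect is the key on disk (or in the TPM) and not a secret_id.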

This is why SPIFFE support in the latest release is interesting: it does some of this and can provide a standardized non-human identity.

One key caveat, though: anyone with admin access to the system can get at these tokens. The same is true in the cloud: give someone exec access to a pod in Kubernetes and they can pull the SA token and use it.

While that part of the security picture stays the same, if you have a proper PAM flow (access must be requested and is time-limited), you now have a way of revoking and issuing new machine tokens.

Hope that gives you some direction.