Vault Agent creates new runner after token renewal

Hi, we are testing vault-agent to renew certs with 1-year validity, but we see vault-agent gracefully restarting the runners every 3.5 hrs. I am sure there is some config error on our end and am trying to understand it.

Logs:

Aug 09 12:39:18 vault[26725]: test
Aug 09 12:39:18 vault[26725]: 2022-08-09T12:39:18.385Z [INFO] (runner) rendered "(dynamic)" => "/etc/ssl/test/be-test-clnt.crt"
Aug 09 12:39:18 vault[26725]: 2022-08-09T12:39:18.387Z [INFO] (runner) rendered "(dynamic)" => "/etc/ssl/test/be-test-clnt.key"
Aug 09 12:39:18 vault[26725]: 2022-08-09T12:39:18.387Z [INFO] (runner) executing command "["docker restart test\"]" from "(dynamic)" => "/etc/ssl/test/be-test-clnt.crt"
Aug 09 12:39:18 vault[26725]: 2022-08-09T12:39:18.387Z [INFO] (child) spawning: sh -c docker restart test
Aug 09 12:39:28 vault[26725]: test
Aug 09 13:21:45 vault[26725]: 2022-08-09T13:21:45.402Z [INFO] auth.handler: renewed auth token
Aug 09 14:04:22 vault[26725]: 2022-08-09T14:04:22.945Z [INFO] auth.handler: renewed auth token
Aug 09 14:47:30 vault[26725]: 2022-08-09T14:47:30.059Z [INFO] auth.handler: renewed auth token
Aug 09 15:30:07 vault[26725]: 2022-08-09T15:30:07.601Z [INFO] auth.handler: renewed auth token
Aug 09 16:13:15 vault[26725]: 2022-08-09T16:13:15.929Z [INFO] auth.handler: renewed auth token
Aug 09 16:13:15 vault[26725]: 2022-08-09T16:13:15.929Z [INFO] auth.handler: lifetime watcher done channel triggered
Aug 09 16:13:15 vault[26725]: 2022-08-09T16:13:15.929Z [INFO] auth.handler: authenticating
Aug 09 16:13:15 vault[26725]: 2022-08-09T16:13:15.959Z [INFO] auth.handler: authentication successful, sending token to sinks
Aug 09 16:13:15 vault[26725]: 2022-08-09T16:13:15.959Z [INFO] auth.handler: starting renewal process
Aug 09 16:13:15 vault[26725]: 2022-08-09T16:13:15.959Z [INFO] sink.file: token written: path=/var/vault/token/.vault-token
Aug 09 16:13:15 vault[26725]: 2022-08-09T16:13:15.959Z [INFO] template.server: template server received new token
Aug 09 16:13:15 vault[26725]: 2022-08-09T16:13:15.959Z [INFO] (runner) stopping
Aug 09 16:13:15 vault[26725]: 2022-08-09T16:13:15.960Z [INFO] (runner) creating new runner (dry: false, once: false)
Aug 09 16:13:15 vault[26725]: 2022-08-09T16:13:15.960Z [INFO] (runner) creating watcher
Aug 09 16:13:15 vault[26725]: 2022-08-09T16:13:15.960Z [INFO] (runner) starting
Aug 09 16:13:15 vault[26725]: 2022-08-09T16:13:15.961Z [INFO] (runner) received finish

And then it starts over again.
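To put numbers on the cycle, here is a small Python sketch that measures the gaps between the timestamps in the log excerpt above. It assumes the first render at 12:39:18 is close to the moment the agent last authenticated, which the excerpt does not actually show. The variable and function names are made up for illustration.

```python
from datetime import datetime

# Timestamps copied from the log excerpt above.
FIRST_RENDER = "2022-08-09T12:39:18.385Z"  # assumed to be near the initial login
RENEWALS = [
    "2022-08-09T13:21:45.402Z",
    "2022-08-09T14:04:22.945Z",
    "2022-08-09T14:47:30.059Z",
    "2022-08-09T15:30:07.601Z",
    "2022-08-09T16:13:15.929Z",
]
REAUTH = "2022-08-09T16:13:15.959Z"  # "authentication successful" line


def parse(ts: str) -> datetime:
    """Parse the agent's UTC timestamp with millisecond precision."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S.%fZ")


def gaps_in_minutes(stamps):
    """Return the gap, in minutes, between each consecutive pair of timestamps."""
    times = [parse(s) for s in stamps]
    return [round((b - a).total_seconds() / 60, 1) for a, b in zip(times, times[1:])]


renewal_gaps = gaps_in_minutes([FIRST_RENDER] + RENEWALS)
total_hours = round((parse(REAUTH) - parse(FIRST_RENDER)).total_seconds() / 3600, 2)

print("minutes between renewals:", renewal_gaps)
print("hours from first render to re-authentication:", total_hours)
```

The renewals land roughly 43 minutes apart, and the full re-authentication (which is what recreates the runner) comes a little over 3.5 hours after the first render. That pattern is consistent with a token that can be renewed several times before it hits a maximum lifetime, though the logs alone do not prove which TTL settings are responsible.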

vault-agent config below:

pid_file = "/var/run/vault/vault-agent.pid"

vault {
  address = "https://test"
  tls_skip_verify = true
  retry {
    num_retries = 5
  }
}

auto_auth {
  method {
    type = "approle"
    config = {
      role_id_file_path = "/etc/vault.d/agent/role-id"
      secret_id_file_path = "/etc/vault.d/agent/secret-id"
      remove_secret_id_file_after_reading = false
    }
  }

  sink {
    type = "file"
    config = {
      path = "/var/vault/token/.vault-token"
    }
  }
}

cache {
  use_auto_auth_token = true
}

template_config {
  static_secret_render_interval = "48h"
}

template { }

I suspect the Vault approle token policy.
I am looking for suggestions on how the Vault approle should be configured for our use case.

What’s your TTL on the approle?

Vault agent / consul-template doesn’t have good support for PKI certificates: when set up in the way you’d naturally expect, it cannot check whether the certificate on disk is still valid and should continue to be used.

A bit of a hacky workaround was recently added, but it requires you to rewrite your template in an unusual way, for which the only documentation I could find using Google is this comment in a GitHub issue: "Consul template ignoring rendered file, keeps generating certs at every reload or restart" (Issue #1597, hashicorp/consul-template).

It isn’t exactly restarting; it’s just going through its loop, whose interval is probably half of your auth TTL.

Cert renewal isn’t exactly in the wheelhouse of the templating language; although it can be done, it isn’t an easy task. As for the cert itself, you can certainly output it as a file, but I don’t know of any application that can re-read a cert and apply it on the fly, so what does the short TTL buy you?

Hi, sorry for the delayed response. I had to tune the approle so that it does not generate tokens that expire frequently, and that solved the problem for now.
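For anyone landing here later, this is roughly what tuning the role looks like through Vault's HTTP API, sketched in Python. Everything specific in it is an assumption for illustration: the Vault address, the admin token, the role name `cert-agent`, the `approle` mount path, and the TTL values are all hypothetical and are not the settings used in this thread. `token_ttl` and `token_max_ttl` are role parameters of the AppRole auth method. The sketch only builds the request, so nothing is sent until you call `urlopen` yourself.

```python
import json
import urllib.request


def build_role_update(vault_addr, admin_token, role_name, token_ttl, token_max_ttl):
    """Build (but do not send) a request that updates an AppRole's token TTLs.

    Assumes the AppRole auth method is mounted at the default path "approle".
    """
    url = f"{vault_addr.rstrip('/')}/v1/auth/approle/role/{role_name}"
    body = json.dumps({"token_ttl": token_ttl, "token_max_ttl": token_max_ttl}).encode()
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"X-Vault-Token": admin_token, "Content-Type": "application/json"},
    )


# Hypothetical values, not taken from the thread.
req = build_role_update(
    "https://vault.example.com:8200", "s.hypothetical", "cert-agent", "24h", "768h"
)
print(req.get_method(), req.full_url)
# To actually apply the change: urllib.request.urlopen(req)
```

Keep in mind that the effective TTLs are also capped by the auth mount and system maximums, so raising the role's values alone may not be enough.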

I came across this in the vault-agent documentation; it will help us without updating the Vault approle TTL configs.

I am wondering how to use pkiCert in my template. There is no example. Any pointers on this?

Thanks a lot for all your time and suggestions

I think I found it. It is in this documentation.

Hi,
I have a question.

I’m working on integrating HashiCorp Vault into our application using Vault Agent for authentication. The initial setup works well, where the application reads the Vault token from a file generated by Vault Agent and uses it to authenticate with the Vault server.

However, I’m concerned about handling scenarios where the token’s max TTL is reached, and a new token is generated by Vault Agent.

Currently, our application reads the token once during initialization and uses it for subsequent operations. If the token expires and a new one is generated, the application wouldn’t automatically know about the new token, which could lead to failed operations.

To address this, I am thinking of implementing a file watcher that monitors the token file for changes. When a new token is generated, the watcher reloads the token and updates the Vault client. While this seems to work in theory, I want to ensure that we’re following best practices and not missing any important considerations.

Here are the specific questions I have:

  1. Is monitoring the token file for changes and reloading the token dynamically the recommended approach for handling token renewal with Vault Agent?
  2. Are there any potential pitfalls or edge cases I should be aware of when implementing this solution?
  3. Are there more efficient or reliable methods to ensure the application always has access to a valid token, especially in high-availability or production environments?

I’d appreciate any feedback or suggestions on improving this implementation.
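For reference, the file watcher idea described above could look something like the following Python sketch. It is only one possible shape, not an official or recommended pattern: it polls the sink file instead of using OS file notifications, and the class name `TokenFileWatcher`, the `on_change` callback, and the polling interval are all hypothetical. It compares file content, not modification time, so a rewrite that leaves the token unchanged does not trigger a reload, and it treats a missing or empty file as "try again later" since the file may be in the middle of being replaced.

```python
import threading


class TokenFileWatcher:
    """Polls a token file and calls on_change(token) when the token changes."""

    def __init__(self, path, on_change, interval=5.0):
        self.path = path
        self.on_change = on_change
        self.interval = interval
        self._last_token = None
        self._stop = threading.Event()

    def check_once(self):
        """Read the file once; return True if a new, non-empty token was delivered."""
        try:
            with open(self.path, "r", encoding="utf-8") as fh:
                token = fh.read().strip()
        except OSError:
            return False  # file missing or being replaced; retry on the next tick
        if not token or token == self._last_token:
            return False
        self._last_token = token
        self.on_change(token)
        return True

    def run(self):
        """Poll until stop() is called; intended to run in a background thread."""
        while not self._stop.is_set():
            self.check_once()
            self._stop.wait(self.interval)

    def stop(self):
        self._stop.set()
```

In an application this would typically be started with something like `threading.Thread(target=watcher.run, daemon=True).start()`, with `on_change` swapping the token on the Vault client. Two edge cases worth handling on top of this: requests that fail with a permission error in the window between the old token expiring and the new one being picked up (retry after calling `check_once()`), and making the token swap on the client thread-safe.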