We have Vault deployed on a k8s cluster and integrated with several other k8s clusters using the CSI driver to provide secrets to workloads there. It all works OK, but:
the list of leases for every cluster is several thousand pages long, e.g.:
/vault/data $ ls -lh
total 8G
drwxrws--- 2 root vault 16.0K Nov 15 19:42 lost+found
-rw-rw---- 1 vault vault 36 Nov 15 19:42 node-id
drwxrwsr-x 3 vault vault 4.0K Nov 15 19:42 raft
-rw------- 1 vault vault 8.0G Mar 16 16:33 vault.db
and the pods' memory consumption:
(⎈ |gke-devops-prod:vault) ~ k top pods
NAME                 CPU(cores)   MEMORY(bytes)
in-cluster-vault-0   13m          79Mi
in-cluster-vault-1   28m          369Mi
in-cluster-vault-2   63m          9557Mi
Is this expected? We don't have a big number of workloads there (maybe 20-30 with Vault integration) - Vault is being hit with no more than 1.5-2 RPS (not 2k RPS - just 2 RPS).
I believe it also explains why Vault takes forever to restart / roll out a new version - around 90-120 min per single pod - and while this is happening the other pods' resource consumption climbs to 11 GB…
What could be a misconfiguration on our side?
Could you check the number of entities that are bound to the auth method? If it's more than expected, please have a look at the alias / service account UID being used.
I will also have a look at your lease settings, as this could well be the issue.
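A minimal sketch of the checks I mean, assuming the auth method is mounted at the default kubernetes/ path (adjust the mount path and the jq usage to your setup):

# count identity entities and entity aliases
vault list -format=json identity/entity/id | jq length
vault list -format=json identity/entity-alias/id | jq length

# inspect one entity to see which alias / service account UID it is bound to
vault read identity/entity/id/<entity_id>

# count the token leases created by Kubernetes auth logins
# (this listing can be slow if there are hundreds of thousands of leases)
vault list -format=json sys/leases/lookup/auth/kubernetes/login | jq length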
any ideas @RemcoBuddelmeijer ?
I've reconfigured the k8s auth method to have a max lease TTL of 60s and I'm waiting to see whether the ~500 000 old leases will expire… Either way there is something not quite right with this native k8s integration - why were these leases never revoked?
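For anyone following along, the tuning itself is roughly this (assuming the auth method is mounted at the default kubernetes/ path):

# lower the default and max lease TTL on the Kubernetes auth mount
vault auth tune -default-lease-ttl=60s -max-lease-ttl=60s kubernetes/

# verify the new settings
vault read sys/auth/kubernetes/tune

Note that this only affects newly issued tokens; existing leases keep the TTL they were created with.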
@lukpep I have had a look at all the components that are being used, this means:
Kubernetes Auth
CSI provider
So far nothing odd has shown up on my side. Whenever I use the CSI provider on your exact Vault version, it all goes smoothly and only a single lease is created.
However, this was not the case when a secret could not be read: rather than 1 lease it would create multiple, one for each retry, but this was a finite amount and the leases expired. Perhaps checking through audit logging and debug logging whether all secrets are read on the first try within the timeout would bring more to light?
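If audit logging isn't enabled yet, a minimal sketch (the file path is illustrative and must be writable by the Vault pods):

# enable a file audit device and confirm it
vault audit enable file file_path=/vault/audit/vault_audit.log
vault audit list

# then grep the audit log for your secret path to see how often it is read
# and whether any reads fail and get retried
grep '<your-secret-path>' /vault/audit/vault_audit.log | tail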
One thing that I did want to ask you is to check your secrets store CSI driver version. Could you share this perhaps?
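If it helps, something along these lines should show it (release name, namespace and labels depend on how the chart was installed):

# Helm release version of the Secrets Store CSI Driver
helm list -A | grep -i secrets-store

# image tags actually running in the driver DaemonSet
kubectl get daemonset -A -l app=secrets-store-csi-driver \
  -o jsonpath='{range .items[*]}{.metadata.name}{": "}{.spec.template.spec.containers[*].image}{"\n"}{end}'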
Other than that I don't know anything specific about your Vault setup, which makes it very hard to judge what is going wrong. 2 RPS could still hide some misconfiguration that isn't caught by that particular metric.
To go further I would have to know more, and since this is sensitive information I can understand if that's out of reach. It's up to you either to reach out to HashiCorp themselves or to share it here. (If you were to share this information, please make sure it's cleared by whoever is in charge and disclose it securely. In general I recommend against sharing it, as it's your own personal Vault setup!)
Sorry if this wasn't what you wanted to hear. The CSI driver seems to function as expected on the latest (Helm) version, with no out-of-the-ordinary test cases.
@RemcoBuddelmeijer thanks for your time.
Regarding the CSI driver - I'm using 1.0.0 from the Secrets Store CSI Driver Helm chart repository, and I can see that the newest one is 1.1.1.
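If upgrading turns out to matter here, it should be just something like this (release name and namespace are assumptions based on a default install):

helm repo add secrets-store-csi-driver https://kubernetes-sigs.github.io/secrets-store-csi-driver/charts
helm repo update
helm upgrade csi-secrets-store secrets-store-csi-driver/secrets-store-csi-driver \
  --namespace kube-system --version 1.1.1 --reuse-values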
Here is what I was able to see while looking at the nginx ingress controller logs (my Vault sits behind it):
No secret data leaked here - I’ve checked
Every 2 minutes we have a login POST and a GET of the secrets… and for some reason this is repeated 5-6 seconds later. Every login creates a new token and a new lease, I assume?
When it comes to this specific app's config, it's using a SecretProviderClass object
configured with 5 keys - all coming from a single secret path:
/v1/app-secrets/data/some-random-app/prod
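For context, the SecretProviderClass looks roughly like this - names, the address and the keys below are sanitized / illustrative, but the shape is the same (2 of the 5 keys shown):

cat <<'EOF' | kubectl apply -f -
apiVersion: secrets-store.csi.x-k8s.io/v1
kind: SecretProviderClass
metadata:
  name: some-random-app-prod
spec:
  provider: vault
  parameters:
    vaultAddress: "https://vault.example.internal"   # sits behind the nginx ingress
    roleName: "some-random-app"                      # Kubernetes auth role
    objects: |
      - objectName: "db-password"
        secretPath: "app-secrets/data/some-random-app/prod"
        secretKey: "db-password"
      - objectName: "api-key"
        secretPath: "app-secrets/data/some-random-app/prod"
        secretKey: "api-key"
EOF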
What troubles me is:
why is this pattern repeated twice every 2 minutes? It seems like a single login and get-secret should be enough, right?
while I understand why we need to check secrets every 2 minutes (secret rotation), it looks like without some kind of token caching this solution will scale poorly. We are talking about ~45k lease objects per month per application (per SecretProviderClass to be exact - I'm not sure how it behaves when there are multiple secret paths, and not only keys, under the same SecretProviderClass object). With 20 apps per cluster x 4 clusters (nothing extraordinary, I believe), we end up with close to 4 million lease objects per month, which in our case (extrapolating from the nearly 2 million we already have) translates to a Vault instance with > 20 GB of memory used and startup / restart times counted in hours.
So my question is: should we shorten the TTL on these leases from 1 month to, let's say, 1 minute, to somehow keep their number under control? Or will constant token revokes every minute kill the CPU?
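If shortening is the way to go, the TTL of these login tokens is controlled on the Kubernetes auth role (role name, service account, namespace and policy below are placeholders; writing a role replaces its whole config, so every existing parameter has to be re-specified):

vault write auth/kubernetes/role/some-random-app \
    bound_service_account_names=some-random-app \
    bound_service_account_namespaces=prod \
    token_policies=some-random-app-read \
    token_ttl=60s \
    token_max_ttl=60s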
OK - I know why this login / get pattern is repeated after 5s: it is done separately for every single pod in a ReplicaSet… and this particular service has 2 replicas. When I scaled it to 3, I got 3x login and get secret. Not so optimal, I must say.
Looks to me like you might be better off using the Vault Agent rather than the CSI driver for the time being. I will have a look at the Vault CSI Provider and see what can be done to improve upon this.
The issue here really seems to be in the authentication part rather than any type of secret caching. Caching will improve the provider a lot, but the leases are a huge deal as they are being tracked in memory. A lease shouldn’t have to be created every 5s, not even every 2m.
Would having a look at the Vault Agent be something you’d be interested in?
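For reference, a minimal sketch of what the agent route looks like, assuming the Vault Helm chart is installed with injector.enabled=true and a matching Kubernetes auth role exists (names below are placeholders):

kubectl patch deployment some-random-app --type merge -p '
spec:
  template:
    metadata:
      annotations:
        vault.hashicorp.com/agent-inject: "true"
        vault.hashicorp.com/role: "some-random-app"
        vault.hashicorp.com/agent-inject-secret-app: "app-secrets/data/some-random-app/prod"
'

The injected agent sidecar logs in once, keeps renewing its own token, and renders the secret to /vault/secrets/app inside the pod, so it avoids the login-per-poll pattern above.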
I was not a big fan of the agent (last time I checked it) since it requires an extra sidecar per secret-aware workload. I will validate it once again.
What is “broken” in the current CSI driver implementation in my opinion:
it does not make use of the lease TTL - instead rotation-poll-interval from here is what defines the number of leases created per hour / month etc. A token created via a single login should be cached and reused for the TTL it was created with (a stopgap for the poll interval is sketched after this list).
the CSI driver should not make requests (and logins) per pod in the ReplicaSet - it is counter-intuitive that a deployment of 100 pods triggers 100 logins and 100 GETs for the same secret every rotation-poll-interval (2 minutes by default). It also creates inconsistency in the secret itself, since the sync interval is bound to pod lifetime (the counter starts when the pod is created), so there can be a window where the secret has been refreshed in some pods but not yet in others - and in the worst case that window lasts a full rotation-poll-interval, which is definitely not desired.
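Until that is fixed upstream, the only knob I see is the driver's own rotation settings; a stopgap sketch against the Secrets Store CSI Driver Helm release (release name and namespace are assumptions), at the cost of slower secret rotation:

helm upgrade csi-secrets-store secrets-store-csi-driver/secrets-store-csi-driver \
  --namespace kube-system --reuse-values \
  --set enableSecretRotation=true \
  --set rotationPollInterval=30m

Each poll still means one login and one GET per pod; it just happens 15x less often than with the 2m default.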
If having a sidecar for every single one of your deployments isn't an option, then sadly that leaves things as they are for now.
I 100% agree, and I think this should be fixed in a way that survives updates of the Secrets Store CSI Driver. Right now I see a lack of real API usage - I'd rather see some of that than just API objects.
From looking at the GitHub issues it does seem like they are aware of this and have a plan to work towards it in the future. Either way this should be fixed, at least in the interim.
I think this is where it starts becoming a bit hard. v1.0.0 has just been released, and with it the first stable release of the CSI driver itself. A lot of things couldn't be put into place yet, as there either wasn't enough time or it wasn't clear what might be introduced and what not.
Time will fix these issues, as I am sure the Vault team is aware that making 100 requests for 100 pods isn't sustainable.
How about we start off by creating a (number of) issue(s) on the GitHub repository and linking this thread? I can do this after doing some more research into the Vault CSI Provider itself.