We recently upgrade to the K8s version to 1.21. After upgrading, clients are unable to authenticate themselves with kubernetes authentication.
we have configured kubernetes authentication as suggested in the document: https://www.vaultproject.io/api/auth/kubernetes
The issuer field is also set. We have tried both options: setting issuer field and disabling issuer validation.
Immediately after configuring vault, the authentication works fine.
It starts failing with the error “permission denied” after vault pod is restarted.
Vault logs show the following reason for rejecting the request:
login unauthorized due to: lookup failed: service account unauthorized; this could mean it has been deleted or recreated with a new token
It starts to work again, if configuration commands are executed again.
Thanks for raising this. Judging from the linked GitHub issue, I think what’s probably happening here is the k8s token that k8s auth is configured with gets deleted when the vault-0 pod is deleted because projected volume tokens are tied to the lifetime of their associated pod.
K8S_TOKEN=$(kubectl exec vault-0 -- cat /var/run/secrets/kubernetes.io/serviceaccount/token)
...
vault write auth/kubernetes/config token_reviewer_jwt="$K8S_TOKEN" kubernetes_host="$K8S_ADDRESS" kubernetes_ca_cert="$K8S_CA" issuer="$K8S_ISSUER"
# This line then deletes the token saved in K8S_TOKEN, meaning Vault no longer has valid credentials to query the Kubernetes API with
kubectl delete pod vault-0
Broadly, you could solve this in one of two ways:
Use a long-lived token, i.e. use the service account’s default token which is stored in a k8s secret instead of the ephemeral token that gets mounted in the pod
Don’t set token_reviewer_jwt, and instead apply the system:auth-delegator role to the service accounts logging in to Vault. Without token_reviewer_jwt set, k8s auth will use the JWT passed to it during login.
Unfortunately, we do have a lot of documentation/tutorials that assumes the token in Vault’s own pod is long-lived. I’ll raise this internally and we’ll review how we want to address this for our own documentation.
Thanks for the reply! The first workaround I’d like to avoid (as it kinda feels wrong, as in “Yeah login happens with this short lived token, but btw. vault itself uses a static one to check for validity”)
The second one works if I also set disable_local_ca_jwt=true - I do not know if this is intentional? I’d assume having that on default (false) would already do exactly that - check for validity with the local token if token_reviewer_jwt is unset (and therefore also not needing an additional rolebinding for all serviceaccounts accessing vault)
It would be nice if the issue is fixed in vault. An option could be provided so that it can be configured to always use the current local token available in the pod instead of storing the JWT token available/provided when kubernetes authentication is configured.
With the current limitation, however, I prefer the first approach. Second approach requires rolebinding for all service-accounts accessing vault. This does not look like a good security practice.
Is there any update on this topic? I’m facing on the same issue when I upgrade the clusters from 1.20 to 1.21 and I’m trying to find the way to configure properly.
The only workaround is to use the legacy token which you need to pick from the secret that gets created along with the serviceaccount. You can also set disable_iss_validation=True to skip the hurdles in finding out the issuer.
The issuer problems briefly mentioned are a separate subject, but also commonly occurring since k8s 1.21. On that side, as of Vault 1.9.0, newly created Kubernetes auth mounts default to disable_iss_validation=true, as Kubernetes also validates the issuer when the TokenReview endpoint is called, so this was duplicate work anyway.
Sorry this has been a bit of a pain point. Hopefully once 1.10 lands this will return to being a much easier auth method to use for most scenarios.
Hi, @tomhjp@tsaarni1 Thanks for sharing the details. But I would like to know more about this resolution in detail. As we are facing issues with a vault in our production environment, it would be great if you can assist me in fixing it.
We are using Azure AKS 1.21.7(Recently upgraded) and Vault Version 1.7.3. For compatibility, we have updated “disable_iss_validation=true” in vault config and it worked. But intermittently, our production application pods are getting Permission denied error and in Vault logs, I can see unauthorized errors.
What I understood my multiple issues is, in k8s 1.21.7 API server is handling service account(JWT Tokens) by refreshing it every hour(3607 seconds), but the vault is not able to do the same in a dynamic way(reloading service account tokens) resulting in authorization errors.
Error From Application Pod Logs:
Error making API request.
Code: 403. Errors:
As a part of remediation, what should we do? Do we need to upgrade the vault from 1.7.3 to 1.10? or for time being is there any remediation for the Vault 1.7.3 version.
Appreciate your response, as it’s production workload.
Hi, @tsaarni1 Thank you, we will check this option. Upgrading from v1.7.3 to v1.10 will fix this issue permanently? If yes, then what change do we need to add in our below config.
I’ve attached our vault config file and pasted it above as well.
@zohebs341 The config change you need to apply in that case is to remove token_reviewer_jwt and kuberntes_ca_cert. Follow chapter “Use local service account token as the reviewer JWT” Kubernetes - Auth Methods | Vault by HashiCorp.
This change will be in v1.10.0 but it was backported to vault v1.9.3 as well.