Intermittent auth.handler: error authenticating: error="context deadline exceeded"

I am sometimes getting this error during Vault agent init:

[INFO] auth.handler: authenticating

[1 minute later] [ERROR] auth.handler: error authenticating: error="context deadline exceeded"

when connecting to Vault. The agent can carry on retrying for over 10 minutes, but it always resolves itself eventually.

On the vault pod side I am getting:

[INFO] http: TLS handshake error from 10.242.3.100:33430: EOF

I am guessing that is because of the 1-minute timeout.

Any idea how to debug further? The Vault pods do not seem to be exhausting resources, and S3 seems fine…

I am also using EKS.

Is this the only “error” message you get on the Vault pods?

Around that time, yes!

But I do also see this error sometimes:

[ERROR] core: error during forwarded RPC request: error="rpc error: code = Canceled desc = context canceled"

Can you post the anonymized version of the Vault agent annotations in the Pod spec? (most of it should be safe to post here anyway)

vault.hashicorp.com/agent-init-first: 'true'
vault.hashicorp.com/agent-inject: 'true'
vault.hashicorp.com/agent-inject-default-template: json
vault.hashicorp.com/agent-inject-secret-**foo**: >-
  some/secret/**foo**
vault.hashicorp.com/agent-inject-status: injected
vault.hashicorp.com/agent-inject-token: 'true'
vault.hashicorp.com/agent-pre-populate-only: 'true'
vault.hashicorp.com/agent-requests-cpu: 10m
vault.hashicorp.com/agent-run-as-user: '1000'
vault.hashicorp.com/ca-cert: /vault/tls/ca.crt
vault.hashicorp.com/namespace: ''
vault.hashicorp.com/role: **some-role**
vault.hashicorp.com/tls-secret: **some-secret-ca-crt**

There are lots of these entries: vault.hashicorp.com/agent-inject-secret-**foo**: >- some/secret/**foo**

I put ** around everything that was redacted

10.242.3.100:33430

Did you confirm this was indeed the Pod’s IP? Just trying to eliminate issues that are not related.

Can you share your Vault Kubernetes auth backend config? Have you followed the docs?

Which of these methods are you using?
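For reference, the backend config can be dumped with the standard Vault CLI; the mount path auth/kubernetes and the role name are assumptions and may differ in your setup:

```shell
# Show the Kubernetes auth backend configuration.
# The token_reviewer_jwt itself is write-only and will not be printed.
vault read auth/kubernetes/config

# Also show the role the agent annotation refers to (role name is a placeholder).
vault read auth/kubernetes/role/some-role
```

Both commands need a valid VAULT_ADDR and VAULT_TOKEN in the environment.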

This sounds a lot like an issue we had recently. It had absolutely nothing to do with the authentication config. The backend becomes unavailable because of DynamoDB throttling.

You could check that.

Yes, I did; it is the same.

I use S3, not DynamoDB, and it looks fine.

FYI this only happens during startup of the pod!

Maybe check the AWS instance metadata timeouts (though EKS nodes should already have the max hop limit set to 2).
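You can verify the hop limit from the AWS CLI; the instance ID below is a placeholder:

```shell
# Check the IMDS hop limit on a worker node (instance ID is a placeholder).
# Pods need a hop limit of 2 to reach the instance metadata service.
aws ec2 describe-instances \
  --instance-ids i-0123456789abcdef0 \
  --query 'Reservations[].Instances[].MetadataOptions.HttpPutResponseHopLimit'

# Raise it if needed:
aws ec2 modify-instance-metadata-options \
  --instance-id i-0123456789abcdef0 \
  --http-put-response-hop-limit 2
```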

How did you deploy Vault? Using the official Helm chart?

Yes, we deployed it using the official Helm chart.

Yeah, unfortunately we have already optimised this too!

+----------------------------------------------------+----------------------------------------+
| Field                                              | Value                                  |
+----------------------------------------------------+----------------------------------------+
| Alias name                                         | serviceaccount_uid                     |
| Source                                             |                                        |
| Audience                                           |                                        |
| Bound service account names                        | REDACTED                               |
| Bound service account namespaces                   | *                                      |
| Tokens                                             |                                        |
| Generated Token's Bound CIDRs                      |                                        |
| Generated Token's Explicit Maximum TTL             | 0                                      |
| Generated Token's Maximum TTL                      | 0                                      |
| Do Not Attach 'default' Policy To Generated Tokens | false                                  |
| Maximum Uses of Generated Tokens                   | 0                                      |
| Generated Token's Period                           | 0                                      |
| Generated Token's Policies                         | default,tf-eks-dev-eu-west-2-schedule  |
| Generated Token's Initial TTL                      | 86400                                  |
| Generated Token's Type                             | default                                |
+----------------------------------------------------+----------------------------------------+

I think you need to use pre-formatted text, or just paste the redacted version of the vault read auth/kubernetes/config output.

Edited, sorry. Is that easier to understand?

So I was trying to establish which of the following you used for the Kubernetes auth:

+--------------------------------------+----------------------------+-------------------------+------------------------------------------------------------------+
| Option                               | All tokens are short-lived | Can revoke tokens early | Other considerations                                             |
+--------------------------------------+----------------------------+-------------------------+------------------------------------------------------------------+
| Use local token as reviewer JWT      | Yes                        | Yes                     | Requires Vault (1.9.3+) to be deployed on the Kubernetes cluster |
| Use client JWT as reviewer JWT       | Yes                        | Yes                     | Operational overhead                                             |
| Use long-lived token as reviewer JWT | No                         | Yes                     |                                                                  |
| Use JWT auth instead                 | Yes                        | No                      |                                                                  |
+--------------------------------------+----------------------------+-------------------------+------------------------------------------------------------------+

It is "Use long-lived token as reviewer JWT".
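For context, that variant wires a long-lived service-account token into the auth backend. A minimal sketch, assuming a Secret named vault-auth-secret and the default auth/kubernetes mount path (both placeholders):

```shell
# Extract the long-lived reviewer token from its Secret (names are placeholders).
TOKEN_REVIEW_JWT=$(kubectl get secret vault-auth-secret \
  --output 'go-template={{ .data.token }}' | base64 --decode)

# Configure the auth backend to use it for TokenReview calls.
vault write auth/kubernetes/config \
  token_reviewer_jwt="$TOKEN_REVIEW_JWT" \
  kubernetes_host="https://$KUBERNETES_SERVICE_HOST:$KUBERNETES_SERVICE_PORT" \
  kubernetes_ca_cert=@ca.crt
```

Since the reviewer token never expires, a stale or revoked one is worth ruling out as a cause of slow or failing logins.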