"context canceled"

igor.gumush · August 23, 2021, 11:36am

Hi,
We are running an architecture with 2 Vault nodes and 3 Consul nodes.
Recently we had downtime on our service, and we are trying to understand exactly what happened.

Vault was responding with HTTP 500 to incoming requests (around 10,000 requests), but in the audit log we see only ~300 errors, all of them related to “failed to read … context canceled”.
From the Consul documentation it seems this error can be related to IOPS, but in our case we are far from the limit.
The issue resolved itself within 15 minutes.

None of the Vault/Consul processes failed.
There was no spike in traffic.
There seems to be no fault on the attached AWS disks.
We are running on AWS EC2 VMs behind 2 ELBs.

Questions:

What could cause this issue?
If there was an issue with one of the Consul/Vault nodes, why was there no failover to the standby one?
Is there any way to avoid such an issue in the future?

jeffsanicola · August 23, 2021, 12:25pm

Are you able to get the full error message? The “failed to read … context canceled” seems to be missing some data in the middle that could be useful for assisting with your issue.

However, my first thought would be to see if there was any disruption in the ability to write to your audit devices. If audit devices are enabled and Vault cannot write to at least one of them, it will not allow any operations to proceed. More info on that here: Audit Devices | Vault by HashiCorp

igor.gumush · August 23, 2021, 1:15pm

A few error logs:

/var/vcap/data/sys/log/vault/vault_audit.log
{
  "time": "2021-08-17T18:48:31.463624563Z",
  "type": "response",
  "request": {
    "id": "…",
    "operation": "update",
    "client_token": "…",
    "client_token_accessor": "…",
    "path": "auth/token/renew-self"
  },
  "response": {},
  "error": "1 error occurred:\n\t* failed to read lease entry auth/token/create/h2b9ec5149c9397105c7bb97faeb6f80c8aa0a0c9c59915bbea4620f107e8f872: Get https://internal-consul-213558487.us-west-2.elb.amazonaws.com/v1/kv/vault/sys/expire/id/auth/token/create/h2b9ec5149c9397105c7bb97faeb6f80c8aa0a0c9c59915bbea4620f107e8f872: context canceled\n\n"
}
/var/vcap/data/sys/log/monit/vault.err.log:66717:2021-08-17T18:34:59.014Z [ERROR] core: failed to run existence check: error="existence check failed: Get https://internal-consul-238678873.us-west-2.elb.amazonaws.com/v1/kv/vault/logical/b831a89a-b02f-2085-df9b-cc44d5dc9b2d/81f37567-f14c-4289-817b-57b15ee24d2e/078221f7-da65-491c-9185-4d3f47442e9f/ee744fd9-bdfb-4a9b-bd6b-649b5adea0a2: context canceled"

I checked the audit devices with:
curl --header "X-Vault-Token:…" https://127.0.0.1:8200/v1/sys/audit
The result is attached.
audit_log.txt (1010 Bytes)

It seems we are writing the audit log to a local disk.
Syslog then sends this data to Splunk, so in case of a network issue Vault should just continue writing to disk (unless there is an issue with the disk itself), right?
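
To confirm where each audit device writes, the JSON returned by the sys/audit call above can be summarized with a short script. This is only a sketch: it assumes the devices appear either at the top level of the response or under a "data" key, keyed by mount path, and the sample payload and its file path are made up for illustration rather than taken from the attached file.

```python
import json


def list_audit_devices(payload: dict) -> list:
    """Return sorted (mount path, type, file_path or "") tuples for each audit device."""
    # Assumption: devices sit under "data" when present, otherwise at the top level.
    devices = payload.get("data") or payload
    rows = []
    for mount, cfg in devices.items():
        if not isinstance(cfg, dict) or "type" not in cfg:
            continue  # skip envelope fields such as request_id
        options = cfg.get("options") or {}
        rows.append((mount, cfg["type"], options.get("file_path", "")))
    return sorted(rows)


# Hypothetical response body; the path is illustrative only.
sample = json.loads(
    '{"file/": {"type": "file", "description": "", '
    '"options": {"file_path": "/var/log/vault_audit.log"}}}'
)

for mount, kind, path in list_audit_devices(sample):
    print(f"{mount}\t{kind}\t{path}")
```

A device of type "file" with a local file_path would support the reading that Vault itself only depends on the local disk, not on the Splunk forwarding.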

jeffsanicola · August 23, 2021, 1:27pm

Yes, that is my understanding.

The errors look to be related to an issue communicating with your Consul storage backend. I checked the AWS Status page for any outages in us-west-2 EC2 or ELB (https://status.aws.amazon.com/) but didn’t see any documented outages around the time you had trouble.

Are you able to pull logs out of Consul for around the same time period to see if anything was happening within your storage environment?
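
To line the Vault side up against the Consul logs, it can help to tally the failed audit entries by minute, backend host, and trailing cause. The following is a minimal sketch, assuming the audit log holds one JSON object per line with "time" and "error" fields as in the entry posted above; the function name and the sample hostname are hypothetical.

```python
import json
import re
from collections import Counter

# Pulls the host part out of the first URL found in an error string.
HOST_RE = re.compile(r"https?://([^/\s]+)")


def summarize_audit_errors(lines):
    """Count failed audit entries by (minute, backend host, trailing cause)."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip partial or non-JSON lines
        if not isinstance(entry, dict):
            continue
        error = (entry.get("error") or "").strip()
        if not error:
            continue  # successful entries carry no error text
        minute = str(entry.get("time", ""))[:16]  # e.g. 2021-08-17T18:48
        host = HOST_RE.search(error)
        cause = error.rsplit(": ", 1)[-1]  # text after the last ": "
        counts[(minute, host.group(1) if host else "", cause)] += 1
    return counts


# Hypothetical entry shaped like the one in the thread.
sample = json.dumps({
    "time": "2021-08-17T18:48:31.463624563Z",
    "type": "response",
    "error": "1 error occurred:\n\t* failed to read lease entry x: "
             "Get https://consul.example.internal/v1/kv/vault/x: context canceled\n\n",
})

for key, n in sorted(summarize_audit_errors([sample]).items()):
    print(n, *key)
```

Run against the real file (for example by passing an open file handle as lines), this would show whether the roughly 300 errors cluster in a few minutes and whether they all point at the same Consul ELB hostname.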

igor.gumush · August 23, 2021, 1:54pm

Right, I also checked with AWS; they say there were no faults on their side.
There is also nothing relevant in the Consul logs.

Also, if there is an issue with one Consul node, I would expect Vault to switch to another one. 15 minutes is a long time. Shouldn’t that happen?
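
One way to see which Vault node each ELB would treat as healthy is to probe /v1/sys/health on every node and interpret the status code. The sketch below uses the default status codes documented for that endpoint. Whether the ELBs in this setup actually health-check that path is an assumption, and the helper names are hypothetical.

```python
import urllib.error
import urllib.request

# Default status codes of Vault's /v1/sys/health endpoint.
HEALTH_STATES = {
    200: "active",
    429: "standby",
    472: "DR secondary",
    473: "performance standby",
    501: "not initialized",
    503: "sealed",
}


def describe_health(status_code: int) -> str:
    """Translate a sys/health status code into a node state."""
    return HEALTH_STATES.get(status_code, f"unexpected status {status_code}")


def elb_would_route(status_code: int) -> bool:
    """True if a health check that accepts only HTTP 200 keeps the node in rotation."""
    return status_code == 200


def fetch_status(base_url: str, timeout: float = 3.0) -> int:
    """Return the sys/health status code for one node (makes a network call)."""
    try:
        with urllib.request.urlopen(f"{base_url}/v1/sys/health", timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        return exc.code  # non-2xx responses still encode the node state


for code in (200, 429, 503):
    print(code, describe_health(code), elb_would_route(code))
```

fetch_status is not called here because it needs a reachable node; with a self-signed certificate on 127.0.0.1 it would also need a suitable TLS context. If both nodes kept reporting the same states throughout the incident, that would suggest no failover was ever triggered.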

jeffsanicola · August 23, 2021, 7:56pm

I’m much less familiar with Consul than I am with Vault. Maybe one of the HashiCorp crew can offer some perspective on that?

igor.gumush · August 23, 2021, 9:06pm

Shouldn’t they respond here? Or should I ask somewhere else?

jeffsanicola · August 23, 2021, 9:09pm

If you’re using Vault Enterprise then I’d suggest opening a ticket in the support portal as you’re guaranteed a response there. Otherwise one of the HashiCorp staff may happen upon this thread and respond as they have time.

igor.gumush · August 23, 2021, 9:24pm

We are using the free version for now.
Thanks, I’ll wait.
