TLS errors in k8s

stephen · May 15, 2024, 7:48pm

We are running Vault 1.13.2 in Kubernetes (EKS), and periodically an Ansible playbook fails to get AWS credentials from it. The Ansible error indicates a 504 when making a request to Vault through a load balancer.

In the Vault log, we periodically see errors that look like this:

2024-05-15T17:04:22.196Z [INFO]  http: TLS handshake error from 10.0.2.163:24948: local error: tls: bad record MAC
2024-05-15T17:04:22.231Z [INFO]  http: TLS handshake error from 10.0.3.72:30634: local error: tls: bad record MAC
2024-05-15T17:04:40.541Z [INFO]  http: TLS handshake error from 10.0.2.163:6224: local error: tls: bad record MAC
2024-05-15T17:04:42.490Z [INFO]  http: TLS handshake error from 10.0.3.72:30692: local error: tls: bad record MAC
2024-05-15T17:05:07.845Z [INFO]  http: TLS handshake error from 10.0.3.72:36104: local error: tls: bad record MAC
2024-05-15T17:05:21.850Z [INFO]  http: TLS handshake error from 10.0.3.72:40434: local error: tls: bad record MAC
2024-05-15T17:05:27.405Z [INFO]  http: TLS handshake error from 10.0.2.163:24910: local error: tls: bad record MAC
2024-05-15T17:05:40.090Z [INFO]  http: TLS handshake error from 10.0.3.72:6970: local error: tls: bad record MAC
2024-05-15T17:05:41.401Z [INFO]  http: TLS handshake error from 10.0.2.163:5040: local error: tls: bad record MAC
2024-05-15T17:06:11.659Z [INFO]  http: TLS handshake error from 10.0.3.72:32928: write tcp4 10.0.4.235:8200->10.0.3.72:32928: i/o timeout
2024-05-15T17:06:33.815Z [INFO]  http: TLS handshake error from 10.0.2.163:37822: write tcp4 10.0.4.235:8200->10.0.2.163:37822: i/o timeout
2024-05-15T17:06:38.454Z [INFO]  http: TLS handshake error from 10.0.3.72:51210: write tcp4 10.0.4.235:8200->10.0.3.72:51210: i/o timeout
2024-05-15T17:06:41.241Z [INFO]  http: TLS handshake error from 10.0.3.72:14112: write tcp4 10.0.4.235:8200->10.0.3.72:14112: i/o timeout
2024-05-15T17:06:44.731Z [INFO]  http: TLS handshake error from 10.0.2.163:37246: write tcp4 10.0.4.235:8200->10.0.2.163:37246: i/o timeout
2024-05-15T17:06:45.101Z [INFO]  http: TLS handshake error from 10.0.3.72:14106: write tcp4 10.0.4.235:8200->10.0.3.72:14106: i/o timeout
2024-05-15T17:06:54.879Z [INFO]  http: TLS handshake error from 10.0.2.163:11580: write tcp4 10.0.4.235:8200->10.0.2.163:11580: i/o timeout

Is there something that we should be looking at to solve this problem?
Our vault is backed by S3 btw.
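
The log excerpt above shows two distinct failure kinds (`bad record MAC` and `i/o timeout`) coming from two client addresses. A minimal sketch for tallying such lines per client IP and error kind follows, assuming the log has been saved as plain text in the format shown. The function name `tally_handshake_errors` and the three-way classification are illustrative choices, not part of Vault.

```python
import re
from collections import Counter

# Matches the "TLS handshake error from IP:PORT: reason" lines shown in the post.
LINE_RE = re.compile(
    r"TLS handshake error from (?P<ip>\d+\.\d+\.\d+\.\d+):\d+: (?P<reason>.*)$"
)


def tally_handshake_errors(lines):
    """Count handshake errors per (client IP, error kind).

    Hypothetical helper: the "kind" buckets below cover only the two
    messages that appear in the excerpt, plus a catch-all.
    """
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line.strip())
        if not match:
            continue  # not a handshake error line
        reason = match.group("reason")
        if "bad record MAC" in reason:
            kind = "bad record MAC"
        elif "i/o timeout" in reason:
            kind = "i/o timeout"
        else:
            kind = "other"
        counts[(match.group("ip"), kind)] += 1
    return counts
```

Calling `tally_handshake_errors(open("vault.log"))` (the file name is an assumption) and printing `counts.most_common()` would show whether one client address or one error kind dominates.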

jonathanfrappier · June 7, 2024, 8:39pm

When Ansible fails to connect to Vault, is it a one-off, e.g.:

attempt 1: works
attempt 2: works
attempt 3: fails
attempt 4: works

Or does the error happen consistently over a period of time?

stephen · June 10, 2024, 11:11am

It seems to happen randomly. We’ve updated everything we can find in Ansible and added a retry to the requests, and I think the problem has stopped happening.
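
The thread does not say how the retry was implemented, so the following is only a generic sketch of retrying a flaky request with exponential backoff and jitter. The helper name `retry_with_backoff` and all of its parameters and defaults are hypothetical; `operation` stands in for whatever call fetches the credentials from Vault.

```python
import random
import time


def retry_with_backoff(operation, attempts=5, base_delay=0.5, max_delay=8.0,
                       retry_on=(Exception,), sleep=time.sleep):
    """Call operation() until it succeeds or attempts are exhausted.

    Hypothetical helper: the delay doubles after each failure, is capped
    at max_delay, and gets random jitter so clients do not retry in lockstep.
    """
    last_error = None
    for attempt in range(attempts):
        try:
            return operation()
        except retry_on as error:
            last_error = error
            if attempt == attempts - 1:
                break  # no point sleeping after the final attempt
            delay = min(max_delay, base_delay * (2 ** attempt))
            sleep(delay + random.uniform(0, delay / 2))
    raise last_error
```

Within a playbook itself, the built-in task keywords `retries`, `delay`, and `until` cover the same idea without custom code.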

jonathanfrappier · June 11, 2024, 5:30pm

Glad it's fixed.

If it seemed random, I wonder if there was some timeout in the request to AWS to get the credentials.
