Periodic permission denied using AppRole

To automate tasks, I use the AppRole auth method to authenticate to Vault. Sometimes I get a “403 permission denied” error even though the role-id/secret-id is correct. Later, I can authenticate successfully.
Logs show {"time":"2024-01-16T20:03:31.412408432Z","type":"response","auth":{"token_type":"default"},"request":{"id":"b515d71e-c865-5e9a-d7c8-2ed435697199","client_id":"xxx","operation":"update","mount_point":"auth/approle/","mount_type":"approle","mount_accessor":"auth_approle_d55763f2","client_token":"hvs.xxx","client_token_accessor":"xxx","namespace":{"id":"root"},"path":"auth/approle/login","data":{"role_id":"xxxx","secret_id":"xxx"},"remote_address":"10.244.160.143","remote_port":42980},"response":{"mount_point":"auth/approle/","mount_type":"approle","mount_accessor":"auth_approle_d55763f2"},"error":"permission denied"}
I haven’t enabled rate limit quotas. Even if a rate limit were configured, enable_rate_limit_audit_logging=true is set in the config, so I would see the corresponding log messages.
What could it be? What else could I check?

When you say “Later I can authenticate” are you talking seconds? minutes? hours?

Could you be running into something like this?

Vault Eventual Consistency – HashiCorp Help Center.

Hi, thank you for the reply.
Usually it takes several minutes, maybe 5, 20 or 30.

Vault eventual consistency is an Enterprise feature. I use a Community Edition installation and don’t use performance standbys. Moreover, my Vault cluster is deployed in a Kubernetes cluster, and it has an active service that always points at the active node. Because of this, I don’t know what it could be…

That is quite a long time. Since you mentioned it was automated, I was hoping it was only a few seconds before it became usable again.

Some general questions:

  • Is this a new issue (e.g. just started) or has it been happening for a long time?
  • Was there a time when this did not happen? Any changes around that time?
  • Have you reviewed known issues against the version of Vault you are running?
  • Is it the same client(s) that tend to have problems or is it random? Maybe some client or node does not have NTP set up so there is an offset on timestamps in the request (e.g. client is 5m behind Vault)?

Unless someone else in this forum has other ideas, I think posting as a GitHub issue in the Vault repo might get more eyes on the problem you’re facing.

  1. This is a new issue. It started maybe two weeks ago.
  2. I didn’t notice this when I first started using the service.
  3. Actually, I haven’t :frowning:
  4. I recommended that my colleague use Ansible to make API requests for his automation tasks. One day he told me that a task had failed. At the same time, I tried from the CLI with vault write auth/approle/login... and got a 403 error too. After a few minutes it started to work fine. I checked the logs, but other than the 403 error I found nothing.

Ensure that the approle in question isn’t being locked due to too many bad auth attempts.
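
One way to look for bad auth attempts is to count the failed logins in the audit log. The following is only a rough sketch, assuming the audit device writes one JSON object per line in the same shape as the entry posted above; the function name and the sample lines are made up for illustration. As far as I know, the role_id in the audit log is HMAC'd by default, but the same role_id should produce the same HMAC on a given audit device, so grouping by it should still work.

```python
import json
from collections import Counter


def count_failed_approle_logins(lines):
    """Count 'permission denied' responses on auth/approle/login,
    grouped by (role_id as logged, remote_address)."""
    failures = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip anything that is not a JSON audit entry
        if not isinstance(entry, dict) or entry.get("type") != "response":
            continue
        request = entry.get("request") or {}
        if request.get("path") != "auth/approle/login":
            continue
        if entry.get("error") != "permission denied":
            continue
        role_id = (request.get("data") or {}).get("role_id", "<unknown>")
        remote = request.get("remote_address", "<unknown>")
        failures[(role_id, remote)] += 1
    return failures


# Hypothetical sample entries, trimmed to the fields the function reads.
sample_lines = [
    json.dumps({"type": "response", "error": "permission denied",
                "request": {"path": "auth/approle/login",
                            "data": {"role_id": "role-a"},
                            "remote_address": "10.244.160.143"}}),
    json.dumps({"type": "response", "error": "permission denied",
                "request": {"path": "auth/approle/login",
                            "data": {"role_id": "role-a"},
                            "remote_address": "10.244.160.143"}}),
    json.dumps({"type": "response",
                "request": {"path": "auth/approle/login",
                            "data": {"role_id": "role-a"},
                            "remote_address": "10.244.160.99"}}),
]

for key, count in count_failed_approle_logins(sample_lines).most_common():
    print(key, count)
```

A remote address with a steady stream of failures against one role_id would be the first client to look at.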

Reference:

@alex.blume Thanks for the reply. I think you are right.

The user lockout feature is enabled by default.

I missed this. One role_id has many secret_ids, and one of those secret_ids is wrong. Because of that, the entire AppRole is locked for a period of time. Many thanks!
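
For anyone who lands here later, this toy model shows why one client with a wrong secret_id can cause 403s for clients whose secret_id is correct. It is a simplified sketch, not Vault's actual implementation. The threshold of 5 failures and the 15-minute lockout and counter-reset durations are what I understand the documented defaults to be, so please verify them for your version. The class and method names are invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LockoutModel:
    """Toy model of a per-role lockout (not Vault's real code)."""
    threshold: int = 5                 # assumed default failure threshold
    duration: float = 15 * 60          # assumed lockout duration, seconds
    counter_reset: float = 15 * 60     # assumed counter reset window, seconds
    failures: int = 0
    last_failure: Optional[float] = None
    locked_until: float = 0.0

    def login(self, now: float, secret_ok: bool) -> str:
        # While the role is locked, every login is rejected,
        # including ones that present a valid secret_id.
        if now < self.locked_until:
            return "locked"
        if secret_ok:
            return "ok"
        # A long enough quiet period clears the failure counter.
        if (self.last_failure is not None
                and now - self.last_failure >= self.counter_reset):
            self.failures = 0
        self.failures += 1
        self.last_failure = now
        if self.failures >= self.threshold:
            self.locked_until = now + self.duration
            self.failures = 0
        return "denied"


role = LockoutModel()
# One misconfigured client retries with a wrong secret_id every 10 seconds.
for t in (0, 10, 20, 30, 40):
    role.login(now=t, secret_ok=False)
# A different client with a valid secret_id now gets rejected as well.
print(role.login(now=50, secret_ok=True))
# Once the lockout window has passed, the same valid login succeeds.
print(role.login(now=40 + 15 * 60, secret_ok=True))
```

If the misconfigured client keeps retrying, the role gets locked again shortly after each window ends, which would match the "sometimes it works, sometimes it doesn't" pattern.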