I’m having a bit of an issue with Nomad + Consul: everything appears to be working fine, but the Consul logs on my (single) client are filling up with the following errors:
I ran through the steps from Troubleshoot Consul ACL issues – HashiCorp Help Center and verified that the client’s token in /opt/consul/acl-tokens.json is valid: I set it as my own CONSUL_HTTP_TOKEN environment variable and I can run consul catalog services and get my services back, which shows the Consul server recognises it as a valid token.
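For reference, the check looked roughly like this (the "agent" key in the persisted token file is a guess on my part; inspect your own acl-tokens.json to see which keys the agent actually writes):

```sh
# Pull the persisted agent token; the "agent" key is an assumption
# about the file layout, check your acl-tokens.json for the real keys.
export CONSUL_HTTP_TOKEN=$(jq -r '.agent' /opt/consul/acl-tokens.json)

# If this returns the service list, the server accepts the token.
consul catalog services
```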
What’s interesting is that if I restart Consul via systemctl restart consul, the errors go away, until I run a Nomad job, at which point they reappear with a topic= corresponding to each service the Nomad job uses service discovery for. I’m using the JWT mechanism outlined in Consul ACL with Nomad Workload Identities | Nomad | HashiCorp Developer to authenticate my Nomad workloads against Consul.
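In case it matters, this is roughly what that setup looks like from the Consul side; the auth method name nomad-workloads is just the one the HashiCorp guide uses, yours may differ:

```sh
# The workload identity setup should have created a JWT auth method.
consul acl auth-method list

# Inspect the binding rules that map Nomad workload identities
# to Consul roles / service identities.
consul acl binding-rule list -method=nomad-workloads

# Tokens minted via the auth method show up here; short-lived
# workload tokens appearing and disappearing is expected.
consul acl token list
```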
At this point I’m pretty stuck. It doesn’t seem to be causing any actual problems, and all my services report healthy in Consul and can be discovered from other jobs, but I don’t really want to start relying on the cluster while it is spamming errors.
Does anyone know how I might identify the token that is supposedly not found and work out where the requests are coming from? My only clue as to what is sending them is the from=127.0.0.1:46172 in one of the log lines, but that port number doesn’t correspond to any running service.
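Here is roughly how I’ve been trying to trace it, in case someone can suggest a better approach (the port is just the one from my logs, and it’s ephemeral, so timing matters):

```sh
# While the error is repeating, look up the owner of the source port.
# Ephemeral ports get reused quickly, so run this immediately after
# a fresh log line appears.
sudo ss -tnp | grep 46172

# Stream the agent's logs at trace level for more context around
# each "ACL not found" line.
consul monitor -log-level=trace
```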
The error means the agent presented an ACL token that the server does not recognise.
Instead of setting an environment variable, you can configure the agent to present the token by specifying it in the agent configuration file.
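For example, something like this in the agent’s HCL config (the path and token values are placeholders):

```hcl
# /etc/consul.d/consul.hcl (path may differ on your system)
acl {
  enabled = true
  tokens {
    # Token the agent uses for its own internal operations
    # (node registration, anti-entropy, etc.).
    agent = "<agent token>"

    # Fallback token for requests that arrive without one.
    default = "<default token>"
  }
}
```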
Despite all that, I am still getting these errors in the logs whenever I deploy a service. It’s really confusing, as I have personally validated that <token> is a valid Consul token and can be used with the CLI, so ‘ACL not found’ makes no sense unless the agent is somehow using a different token.
Hey guys, I have the same issue. These ACL errors happen only when I run a Nomad job (Traefik with the Consul Catalog integration). My setup was working fine until it suddenly stopped. All my services are reported as healthy in Consul, but since these errors started, Traefik returns a bad gateway for those services from one specific client. The issue is also inconsistent: after many restarts and redeployments of the same job, the errors disappear and everything works for a while, then the cycle repeats. What is going on?
@rincler, do you have any updates? I know it’s been a long time.
I don’t know if it helps, but at least in our case I think the issue was related to the Traefik Consul Catalog integration being repeatedly and abruptly stopped when its job was redeployed. The Traefik Consul Catalog provider has a watch option that we had enabled:
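Something like this in Traefik’s static configuration (shown here as YAML; the exact file and format will depend on how you deploy Traefik):

```yaml
# traefik.yml (static configuration)
providers:
  consulCatalog:
    # Watch Consul events for catalog changes instead of polling;
    # this keeps long-lived connections open against the local agent.
    watch: true
```

With watch enabled, an abrupt stop tears those connections down without any cleanup, which lined up with when our errors appeared.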