Agent.rpcclient.health ACL not found error

Hi folks,

I’m having a bit of an issue with Nomad + Consul in that everything appears to be working fine, but my Consul logs on the (single) client are filling up with the following errors:

[ERROR] agent.rpcclient.health: subscribe call failed: err="rpc error: code = Unknown desc = ACL not found" failure_count=16 key=psm-redis topic=ServiceHealth

and sometimes

Feb 14 03:31:35 vmi1633713 consul[16957]: 2024-02-14T03:31:35.308+0100 [ERROR] agent.http: Request error: method=GET url="/v1/health/service/postgres?index=67798&passing=1&stale=&wait=60000ms" from=127.0.0.1:46172 error="rpc error: code = Unknown desc = ACL not found"

The server logs are totally clean.

I worked through the steps in Troubleshoot Consul ACL issues – HashiCorp Help Center and confirmed that the client’s token in /opt/consul/acl-tokens.json is valid: I exported it as my own CONSUL_HTTP_TOKEN environment variable, ran consul catalog services, and got my services back, which shows the Consul server recognises it as a valid token.

What’s interesting is that if I restart Consul via systemctl restart consul, the errors go away until I run a Nomad job. At that point, the errors reappear with a key= corresponding to each service I used discovery with in the Nomad job. I’m using the JWT mechanism outlined in Consul ACL with Nomad Workload Identities | Nomad | HashiCorp Developer to authenticate my Nomad workloads against Consul.

At this point, I’m pretty stuck - it doesn’t seem to be causing any issues, and all my services report healthy in Consul and can be discovered from other jobs. But I don’t really want to start relying on the cluster while it is spamming errors.

Does anyone know how I might be able to get the token that is supposedly not found and work out where it is coming from? My only clue as to what is sending these requests is the from=127.0.0.1:46172 in one of the log lines, but that port number doesn’t correspond to any running service.

I think the error is complaining about an incorrect ACL token on the agent.
Instead of setting an environment variable, you can configure the agent to present the token by specifying it in the agent configuration file:

acl = {
  enabled = true
  tokens = {
    agent = "<token>"
  }
}

or by using the command:

consul acl set-agent-token agent <token>
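
To check whether the errors actually stop after changing the token, it may help to count which services are still producing them. The following is only a rough sketch: the function name count_acl_errors is made up for illustration, and it assumes the log lines have the same key=<service> format as the ones quoted above.

```shell
# Hypothetical helper: reads Consul log lines on stdin and prints how many
# "ACL not found" subscribe errors were logged per service (the key= field).
count_acl_errors() {
  grep 'ACL not found' \
    | sed -n 's/.*key=\([^ ]*\).*/\1/p' \
    | sort | uniq -c | sort -rn
}
```

Assuming Consul runs as the systemd unit consul (as the systemctl restart consul above suggests), it could be used like this: journalctl -u consul --since "1 hour ago" | count_acl_errors. Lines without a key= field, such as the agent.http request errors, are skipped.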