Consul agent CPU Utilization jumping to its limits in 5 mins with bare minimum traffic and not getting recovered



Problem Description:

We are using Consul for service discovery.
Our Consul servers run as a 3-replica set, and each pod runs a Consul agent as a separate container.
Service discovery is done through those Consul agent containers.
Sporadically, we observe Consul agent containers reaching their CPU limit, which is defined as 200m, within 5 minutes under bare minimum traffic. The agent log at that traffic level looks like this:

Monitor Logs
2023-04-27T13:44:47.264Z [DEBUG] agent.client.memberlist.lan: memberlist: Stream connection
2023-04-27T13:44:36.329Z [DEBUG] agent: Check status updated: check=service:mtb123-LocalD-mtb123-scscf-8488d6b876-xkdjl status=passing
2023-04-27T13:44:39.099Z [DEBUG] agent: Check status updated: check=service:mtb123-RtpRestServ-mtb123-scscf-8488d6b876-xkdjl status=passing
2023-04-27T13:44:39.624Z [DEBUG] agent.http: Request finished: method=GET url=/v1/status/leader from= latency=365.303µs
2023-04-27T13:42:33.121Z [DEBUG] agent: Service in sync: service=mtb123-LocalD-mtb123-scscf-8488d6b876-xkdjl
2023-04-27T13:42:33.121Z [DEBUG] agent: Service in sync: service=mtb123-RtpRestServ-mtb123-scscf-8488d6b876-xkdjl
2023-04-27T13:42:33.121Z [DEBUG] agent: Check in sync: check=service:mtb123-RtpRestServ-mtb123-scscf-8488d6b876-xkdjl
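To put a number on "bare minimum traffic", the monitor output can be tallied per minute and per logger. This is only a minimal Python sketch, not something from our deployment: the `summarize` helper and the regular expression are illustrative assumptions based on the line layout shown above (timestamp, `[LEVEL]`, logger name, message).

```python
import re
from collections import Counter

# Assumed layout: "<timestamp> [LEVEL] <logger>: <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\S+)\s+\[(?P<level>[A-Z]+)\]\s+(?P<logger>[\w.]+):\s*(?P<msg>.*)$"
)

# Three lines copied from the monitor output above.
SAMPLE = [
    "2023-04-27T13:44:47.264Z [DEBUG] agent.client.memberlist.lan: memberlist: Stream connection",
    "2023-04-27T13:44:39.624Z [DEBUG] agent.http: Request finished: method=GET url=/v1/status/leader from= latency=365.303µs",
    "2023-04-27T13:42:33.121Z [DEBUG] agent: Service in sync: service=mtb123-LocalD-mtb123-scscf-8488d6b876-xkdjl",
]


def summarize(lines):
    """Count log lines per minute and per (level, logger) pair."""
    per_minute = Counter()
    per_logger = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip banners and anything that is not a log line
        per_minute[match.group("ts")[:16]] += 1  # keep "YYYY-MM-DDTHH:MM"
        per_logger[(match.group("level"), match.group("logger"))] += 1
    return per_minute, per_logger


if __name__ == "__main__":
    minutes, loggers = summarize(SAMPLE)
    for minute, count in sorted(minutes.items()):
        print(minute, count)
    for (level, logger), count in loggers.most_common():
        print(level, logger, count)
```

Feeding the full monitor output through `summarize` instead of `SAMPLE` would show whether the log volume changes around the time the CPU jumps.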

Most of the time, CPU usage stays below 30m-40m.

Once it reaches 200m, it does not come back down to its normal level.

Output of the top command:
wnv4a0cscf0001c-scscf-66785f456f-vqgwj consul-agent 201m 69Mi
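For clarity, this is the comparison we are making between the reported usage and the configured limit. It is a minimal Python sketch: the helper names (`parse_top_line`, `cpu_to_millicores`, `at_cpu_limit`) and the 5% tolerance are hypothetical. It assumes the four-column line shown above and CPU quantities written either with an `m` suffix or as plain cores.

```python
def cpu_to_millicores(quantity: str) -> float:
    """Convert a CPU quantity such as '201m' or '0.2' to millicores.

    Only the 'm' suffix and plain core counts are handled in this sketch.
    """
    quantity = quantity.strip()
    if quantity.endswith("m"):
        return float(quantity[:-1])
    return float(quantity) * 1000.0


def parse_top_line(line: str):
    """Split a 'pod container cpu memory' line into its four fields."""
    pod, container, cpu, memory = line.split()
    return pod, container, cpu, memory


def at_cpu_limit(usage: str, limit: str, tolerance: float = 0.05) -> bool:
    """True when usage is at, above, or within `tolerance` of the limit."""
    threshold = cpu_to_millicores(limit) * (1.0 - tolerance)
    return cpu_to_millicores(usage) >= threshold


if __name__ == "__main__":
    line = "wnv4a0cscf0001c-scscf-66785f456f-vqgwj consul-agent 201m 69Mi"
    pod, container, cpu, memory = parse_top_line(line)
    print(container, cpu, "at limit:", at_cpu_limit(cpu, "200m"))
```

With the values from this post, 201m against a 200m limit counts as at the limit, while the usual 30m-40m does not.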

Consul version : 1.8.0

Consul Helm chart template:


Limits:
  **cpu:     200m**
  memory:  200Mi
Requests:
  cpu:      100m
  memory:   200Mi

**Readiness:**  exec [/bin/sh -ec curl --connect-timeout 5 --max-time 5 http://localhost:8500/v1/status/leader \
2>/dev/null | grep -E '".+"'
] delay=0s timeout=5s period=10s #success=1 #failure=3
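As we understand it, the readiness probe passes only when `/v1/status/leader` returns a non-empty quoted string (the leader address), assuming the `curl` output is piped into `grep`. The sketch below restates that check in Python to make the failure condition explicit. The function names are hypothetical, and the `urlopen` timeout only roughly corresponds to curl's `--max-time 5`.

```python
import re
import urllib.request

# Same URL the probe calls.
LEADER_URL = "http://localhost:8500/v1/status/leader"


def body_reports_leader(body: str) -> bool:
    """Mirror of grep -E '".+"': the body contains a non-empty quoted string."""
    return re.search(r'".+"', body) is not None


def probe(url: str = LEADER_URL, timeout: float = 5.0) -> bool:
    """Fetch the leader endpoint and apply the same check as the probe."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            body = response.read().decode("utf-8", errors="replace")
    except OSError:
        # Connection refused, timeout, or HTTP error status: probe fails.
        return False
    return body_reports_leader(body)


if __name__ == "__main__":
    print("ready" if probe() else "not ready")
```

Under this reading, a slow or failing response from the local agent is enough to fail the probe, which may be related to the repeated "Readiness probe failed" events.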


Type Reason Age From Message

Warning Unhealthy 23s (x19473 over 34d) kubelet, wnv4a-c00-perfworkerbm-22 Readiness probe failed:


[wnv4a0cscf0001c-user@wnv4b0depl0001vm001 ~]$ kc logs wnv4a0cscf0001c-scscf-66785f456f-vqgwj -c consul-agent
/opt/startups/ line 7: /logstore/wnv4a0cscf0001c-scscf-66785f456f-vqgwj/consul-agent/startlog.txt: No such file or directory

Consul in Client Mode
Cannot create /tmp/consul-test. Reverting to /tmp
==> Starting Consul agent...
Version: 'v1.0.6.xxxx-23-g6e7b310'
Node ID: 'ffb41e1d-6a01-xxxxx-32d3-35864ca780d6'
Node name: 'wnv4a0cscf0001c-scscf-66785f456f-vqgwj'
Datacenter: 'wnv4a0cscf0001c' (Segment: '')
Server: false (Bootstrap: false)
Client Addr: [ xxx.xx.xx.229] (HTTP: 8500, HTTPS: -1, gRPC: -1, DNS: -1)
Cluster Addr: xx.xx.xx.229 (LAN: 8301, WAN: 8302)
Encrypt: Gossip: true, TLS-Outgoing: false, TLS-Incoming: false, Auto-Encrypt-TLS: false

==> Log data will now stream in as it occurs:

==> Consul agent running!
2023-03-21T18:11:41.829Z [ERROR] agent.anti_entropy: failed to sync remote state: error="No known Consul servers"
2023-03-21T18:11:48.429Z [ERROR] agent: Failed to check for updates: error="Get "": dial tcp i/o timeout (Client.Timeout exceeded while awaiting headers)"
2023-03-21T18:38:24.791Z [ERROR] agent.client: RPC failed to server: method=Status.Leader server=xxx.xx.xx.123:8300 error="rpc error making call: stream closed"
2023-03-21T18:38:33.012Z [ERROR] agent.client.memberlist.lan: memberlist: Conflicting address for wnv4a0cscf0001c-sd-0. Mine: xx.xx.xxxxx.123:8301 Theirs: Old state: 2
2023-03-22T18:39:00.607Z [ERROR] agent: Failed to check for updates: error="Get "": context deadline exceeded (Client.Timeout exceeded while awaiting headers)"


VictoriaMetrics graph:

Debug Error:

Please help us solve this problem and explain what is causing it.

**If any other details are needed, please let us know.**