Mixed consul agent versions

Hello folks,

We’re running a consul-server cluster using v1.12.2 for the servers.
The consul-agents run a mix of v1.16.2 and v1.12.2 for whatever reason (working on getting traction to standardise this to something current, potentially 1.18)

However, I’m seeing a weird problem: from time to time, after restarting a service that the consul-agent watches and reports on to the server cluster, CPU and memory on the agent spike and stay elevated until the consul-agent is restarted.

Network throughput also climbs to around 5-6MB/s even though not much is happening; normal operation shows at most a couple of hundred kb/s.

Is this to be expected when running mixed versions of agents?


The CPU and memory spikes you’re seeing, plus that unexpected jump in network traffic, are not what you’d normally expect, even when mixing Consul versions. That said, a version mix can introduce subtle incompatibilities and may be contributing to the behavior you’re noticing.
I recommend using Go’s pprof tooling (go tool pprof) to dig into Consul’s CPU and memory consumption. The profile files can be captured with the consul debug command, ideally while the agent is still in the elevated state and before you restart it.
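
As a rough sketch of that workflow, something like the following could be run on an affected agent host. The bundle name, duration and interval are placeholder values I picked for illustration, and capturing pprof data may require enable_debug (or a suitably privileged ACL token) on the agent, so treat this as a starting point rather than a recipe.

```shell
#!/bin/sh
# Sketch: capture a debug bundle from the affected agent while CPU/memory
# are still elevated, then inspect the profiles with go tool pprof.
# Assumption: OUT, DURATION and INTERVAL are illustrative values, not
# anything taken from the original setup.

OUT="consul-debug-spike"   # hypothetical bundle name
DURATION="2m"              # total capture window
INTERVAL="30s"             # how often a snapshot is taken within the window

if command -v consul >/dev/null 2>&1; then
  # Collects agent info, metrics, logs and pprof data into $OUT
  # and archives the result.
  consul debug -duration="$DURATION" -interval="$INTERVAL" -output="$OUT"
else
  echo "consul binary not found; run this on the affected agent host" >&2
fi

# The file layout inside the bundle differs between Consul versions, so
# locate the profiles instead of assuming a path:
#   tar -xzf "${OUT}.tar.gz"
#   find "$OUT" -name '*.prof'
#   go tool pprof -top <path to a CPU profile>
#   go tool pprof -top <path to a heap profile>
```

Taking one bundle from a healthy agent and one from a spiking agent, and comparing the top entries, should make it easier to tell which code path is responsible.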