Consul benchmarking : EOF errors

amit-handda · February 12, 2020, 8:08pm

I am getting EOF errors while benchmarking Consul KV PUT/GET requests, for example:
[268] Get http://10.99.0.241:6555/v1/kv/bench: EOF
I am using hey to benchmark Consul.
Consul is deployed as a single-instance server on an 8-core, 16 GB AWS EC2 instance (c4.2xlarge).
The network/OS parameters are tuned (ulimit -n 100000, somaxconn -> 100000).
Benchmarking command line: ./heyl -n 10000 -c 100 -m PUT -d 1234 http://<ip>:6555/v1/kv/bench
n -> total operations
c -> concurrency
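
For readers unfamiliar with the two flags, here is a rough Python sketch of what "n total operations at concurrency c" means. It is not how hey is implemented; `put_once` and `run_benchmark` are hypothetical names made up for this illustration, and the outcome labels are simply HTTP status codes or Python exception class names.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
import urllib.error
import urllib.request


def put_once(url, body=b"1234", timeout=5.0):
    """Issue one PUT and return a short outcome label."""
    request = urllib.request.Request(url, data=body, method="PUT")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return str(response.status)
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status.
        return str(exc.code)
    except Exception as exc:
        # Connection resets, timeouts, early EOF and similar failures.
        return type(exc).__name__


def run_benchmark(do_request, total, concurrency):
    """Call do_request `total` times on `concurrency` worker threads and tally outcomes."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        futures = [pool.submit(do_request) for _ in range(total)]
        return Counter(future.result() for future in futures)
```

Calling `run_benchmark(lambda: put_once("http://<ip>:6555/v1/kv/bench"), total=10000, concurrency=100)` would roughly mirror the command above. One difference to keep in mind: urllib opens a new connection for every request, whereas hey reuses connections unless keep-alive is disabled, so the numbers are not directly comparable.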

Error counts at various values of n and c are as follows:

Concurrency | Ops | Errors (approx.)
----------- | ---- | --------------------
100 | 10000 | no errors
200 | 10000 | PUT(250 EOF), GET(13 EOF)
1000 | 10000 | PUT(1300 EOF), GET(465 EOF)
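
To put the counts above in proportion, this small sketch converts them to error rates. The numbers are copied from the table; `error_rate` is just a helper name chosen for the example.

```python
# Approximate EOF counts copied from the table; each run issued 10000 operations.
runs = [
    {"concurrency": 100, "ops": 10000, "put_eof": 0, "get_eof": 0},
    {"concurrency": 200, "ops": 10000, "put_eof": 250, "get_eof": 13},
    {"concurrency": 1000, "ops": 10000, "put_eof": 1300, "get_eof": 465},
]


def error_rate(errors, ops):
    """Return the share of failed operations as a percentage."""
    return 100.0 * errors / ops


for run in runs:
    put_pct = error_rate(run["put_eof"], run["ops"])
    get_pct = error_rate(run["get_eof"], run["ops"])
    print(f"c={run['concurrency']}: PUT {put_pct:.2f}% failed, GET {get_pct:.2f}% failed")
```

At a concurrency of 1000 that works out to about 13% of PUTs and under 5% of GETs failing.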

Any ideas on the possible cause?

amit-handda · February 18, 2020, 4:46pm

Bumping again … any pointers?

amit-handda · March 4, 2020, 5:08pm

For anyone who lands here, do the following to tune the cluster.
Raise the per-client connection limits for benchmarking so that you don't get connection resets at >100 concurrency:

  "limits" : {
    "http_max_conns_per_client" : 1000,
    "rpc_max_conns_per_client" : 1000
  },

Also, increase the default socket receive and send buffer sizes; otherwise the benchmark client will receive i/o timeouts:

    net.core.rmem_default=851968
    net.core.wmem_default=851968
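
Before a run, it may help to check whether the per-client limits in the agent config are at least as large as the planned concurrency. The sketch below assumes the two keys sit under a top-level `limits` block in a JSON config. The fallback values in `ASSUMED_DEFAULTS` are my understanding of Consul's defaults and should be verified against the documentation for your version. `limits_below_concurrency` is a hypothetical helper, not part of Consul.

```python
import json

# Assumed defaults when a key is not set; verify for your Consul version.
ASSUMED_DEFAULTS = {
    "http_max_conns_per_client": 200,
    "rpc_max_conns_per_client": 100,
}


def limits_below_concurrency(config_text, concurrency):
    """Return the per-client limits that are smaller than the planned concurrency."""
    limits = json.loads(config_text).get("limits", {})
    too_low = {}
    for key, default in ASSUMED_DEFAULTS.items():
        value = limits.get(key, default)
        if value < concurrency:
            too_low[key] = value
    return too_low


example = (
    '{"limits": {"http_max_conns_per_client": 1000, '
    '"rpc_max_conns_per_client": 1000}}'
)
print(limits_below_concurrency(example, 1000))
print(limits_below_concurrency("{}", 1000))
```

An empty result means neither limit is below the requested concurrency. Any key that is returned is a candidate explanation for connections being dropped when everything comes from one benchmark client.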

jsosulska · March 9, 2020, 8:48pm

Hi @amit-handda!

Thanks for posting here, and for the update. I apologize that no one got back to you.
I am curious about your use case here. What traffic patterns are you trying to emulate?
I noticed that you mentioned consul was only running on a single C4.2XL instance. Can I ask why you’re running only a single node?

Thanks for the good question, and the follow up!

amit-handda · March 9, 2020, 9:06pm

Thanks @jsosulska for replying. I was wondering whether people visit the forum or not.

As already noted in my original post, I am benchmarking the Consul agent.
I was running Consul on a single node (single server) because I wanted to observe the numbers on a simpler configuration before setting up a 3-node server cluster + agents configuration …

Thanks,

jsosulska · March 11, 2020, 12:29am

Hi @amit-handda

Thanks for the fast response! It helps a lot. Two quick questions:

What version of Consul are you benchmarking?
Have you seen our server performance guide?

A word of caution for your benchmarking numbers: in our guides, we do post warnings that a single-instance deployment of Consul isn't the same as running a full deployment. When a Consul agent is run in -dev mode, the binary handles both server and agent functionality. Here's a brief example:

You can't replicate Raft behaviors in -dev mode. Raft is often the bottleneck in multi-server setups, since all writes have to be handled by a majority of servers, subject to network latency etc.
-dev mode is especially unrepresentative since it is in-memory only and doesn't even write the Raft log to disk, which means disk I/O, the write bottleneck in a real install, isn't reflected.
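
To make the "majority" point concrete, here is a minimal sketch of the usual Raft quorum arithmetic. The function names are chosen for this example only.

```python
def quorum_size(servers):
    """Smallest number of servers that forms a majority."""
    return servers // 2 + 1


def failure_tolerance(servers):
    """How many servers can be lost while a majority remains."""
    return servers - quorum_size(servers)


for n in (1, 3, 5):
    print(f"{n} server(s): quorum {quorum_size(n)}, tolerates {failure_tolerance(n)} failure(s)")
```

With a single server the quorum is that one server, so a write never waits on a network round trip to a peer. With three servers each write needs two of them, which is why single-node numbers tend to look better than a real cluster would.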

For some additional considerations, please see our Consul internals.

Looking forward to hearing back from you.
