We deployed Consul 1.15 on OpenShift 4.12.
It is a 3-server, 3-client deployment.
We see the i/o timeout error below in the server logs, and connections to consul-server.consul.svc.cluster.local are refused.
The UI is up, though, and we can see the services getting registered properly.
However, the Spring Boot APIs we deployed are unable to connect to consul-server. Consul sidecar proxy injection succeeds, and the sidecar is able to get a token by logging in with the ACL role.
Server error logs:
2023-07-13T01:27:55.385Z [TRACE] agent.tlsutil: IncomingRPCConfig: version=5
2023-07-13T01:27:55.385Z [TRACE] agent.tlsutil: IncomingRPCConfig: version=5
2023-07-13T01:27:55.785Z [TRACE] agent.server: rpc_server_call: method=Status.RaftStats errored=false request_type=read rpc_type=net/rpc leader=false
2023-07-13T01:27:56.085Z [TRACE] agent.server: rpc_server_call: method=Status.RaftStats errored=false request_type=read rpc_type=net/rpc leader=false
2023-07-13T01:27:56.751Z [ERROR] agent.server.rpc: failed to read first byte: conn=from=100.64.0.3:52531 error="read tcp 10.128.10.219:8300->100.64.0.3:52531: i/o timeout"
2023-07-13T01:27:56.883Z [DEBUG] agent.server.memberlist.lan: memberlist: Stream connection from=10.128.7.102:53588
2023-07-13T01:27:56.884Z [TRACE] agent.server.usage_metrics: Starting usage run
2023-07-13T01:27:57.184Z [TRACE] agent.tlsutil: IncomingHTTPSConfig: version=5
2023-07-13T01:27:57.186Z [TRACE] agent.server: rpc_server_call: method=Status.RaftStats errored=false request_type=read rpc_type=net/rpc leader=false
2023-07-13T01:27:57.186Z [TRACE] agent.server: rpc_server_call: method=Status.Leader errored=false request_type=read rpc_type=net/rpc leader=false allow_stale=false blocking=false target_datacenter=dc1 locality=local
2023-07-13T01:27:57.186Z [DEBUG] agent.http: Request finished: method=GET url=/v1/status/leader from=127.0.0.1:57460 latency="128.4µs"
2023-07-13T01:27:57.666Z [DEBUG] agent.server.memberlist.lan: memberlist: Stream connection from=100.64.0.2:65150
2023-07-13T01:27:57.670Z [ERROR] agent.server.rpc: failed to read first byte: conn=from=100.64.0.2:65151 error="read tcp 10.128.10.219:8300->100.64.0.2:65151: i/o timeout"
We tried restarting the consul-server and client pods as a rolling update, but that did not fix the issue. A leader is elected, and all agents show a status of alive with healthy set to true.
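To narrow down where the timed-out RPC connections come from, the remote addresses in the "failed to read first byte" errors can be tallied from the server log. The following is only a minimal sketch: the function name `timeout_sources` and the `SAMPLE` list are made up for illustration, and the regular expression assumes the log line layout shown above.

```python
import re
from collections import Counter

# Matches the "failed to read first byte" RPC errors in the layout shown above
# and captures the remote IP and port of the connection.
PATTERN = re.compile(
    r"\[ERROR\] agent\.server\.rpc: failed to read first byte: "
    r"conn=from=(?P<ip>[\d.]+):(?P<port>\d+)"
)


def timeout_sources(lines):
    """Count i/o timeout RPC errors per remote IP address (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match and "i/o timeout" in line:
            counts[match.group("ip")] += 1
    return counts


# Lines copied from the server log above; in practice, read them from a saved log file.
SAMPLE = [
    '2023-07-13T01:27:56.751Z [ERROR] agent.server.rpc: failed to read first byte: '
    'conn=from=100.64.0.3:52531 error="read tcp 10.128.10.219:8300->100.64.0.3:52531: i/o timeout"',
    "2023-07-13T01:27:56.883Z [DEBUG] agent.server.memberlist.lan: memberlist: "
    "Stream connection from=10.128.7.102:53588",
    '2023-07-13T01:27:57.670Z [ERROR] agent.server.rpc: failed to read first byte: '
    'conn=from=100.64.0.2:65151 error="read tcp 10.128.10.219:8300->100.64.0.2:65151: i/o timeout"',
]

if __name__ == "__main__":
    for ip, count in timeout_sources(SAMPLE).most_common():
        print(ip, count)
```

In the excerpt above, both timeouts come from 100.64.0.x addresses, while the successful memberlist stream on the same server comes from a 10.128.x.x address.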