Hi team,
I’m using Consul (community Helm chart version 0.49.2) for service discovery and KV configuration on EKS v1.27 (3 nodes), currently with 1 server replica.
The requirement is to raise the replica count to 3 to get high availability (HA) for the Consul servers.
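Concretely, the intended change is just bumping the server replica count in the Helm values, roughly like this (a simplified sketch of the relevant section, not my full values file; global.name is assumed from the pod names below):

global:
  name: consul        # assumed; matches the pod names below
  datacenter: dc1
server:
  replicas: 3         # previously 1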
However, with the following configuration the Consul servers and clients stay in a non-ready state:
values.yaml
values.txt (1.1 KB)
kubectl get pods -n consul
NAME READY STATUS RESTARTS AGE
consul-client-xxxxx 0/1 Running 0 13m
consul-client-yyyyy 0/1 Running 0 13m
consul-client-zzzzz 0/1 Running 0 13m
consul-connect-injector-5b86xxxx-7mk9x 1/1 Running 0 13m
consul-connect-injector-5b86xxxx-wft8z 1/1 Running 0 13m
consul-controller-d77bf9xxx-l9zsg 1/1 Running 0 13m
consul-server-0 0/1 Running 0 13m
consul-server-1 0/1 Running 0 13m
consul-server-2 0/1 Running 0 13m
consul-webhook-cert-manager-6cb69bbbbb-4pj4r 1/1 Running 0 13m
Logs of the Consul servers and clients:
consul-client logs
consul-client.txt (7.5 KB)
consul-server logs
Note: consul-server-0 and consul-server-1 have similar logs.
consul-server-1.txt (1.0 KB)
consul-server-2.txt (7.7 KB)
I have also tried adding various other configs (listed below) to values.yaml; see the sketch after this list for how they map to Helm values:
dns enabled
bootstrapExpect: 3
exposeGossipAndRPCPorts: true (for server and client)
hostNetwork: true (client)
dnsPolicy: ClusterFirstWithHostNet (client)
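Mapped to values.yaml, those options correspond roughly to the following (a simplified sketch using the 0.49.x chart key names as I understand them, not my exact file):

dns:
  enabled: true
server:
  replicas: 3
  bootstrapExpect: 3            # wait for all 3 servers before electing a leader
  exposeGossipAndRPCPorts: true # expose the servers' gossip/RPC ports as hostPorts
client:
  exposeGossipPorts: true       # client-side option for exposing gossip ports
  hostNetwork: true
  dnsPolicy: ClusterFirstWithHostNet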
I tried this both by upgrading the existing single-replica Consul setup in place and by deploying Consul fresh on a new cluster, but the issue persists in both cases.
One thing I observed: only one of the 3 servers (consul-server-2) has all the required ports open inside its container; the other two only listen on 8300 (server RPC), while 8301/8302 (Serf LAN/WAN), 8500 (HTTP), 8503 (gRPC), and 8600 (DNS) are missing. This is probably why we get the connection-refused errors seen in the logs.
Inside the consul-server-0 and consul-server-1 containers:
kubectl exec -it consul-server-1 -n consul -- sh
/ $ netstat -tulpn
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 :::8300 :::* LISTEN 11/consul
/ $ consul members
Node Address Status Type Build Protocol DC Partition Segment
consul-server-2 xx.yy.zzz.225:8301 alive server 1.13.4 2 dc1 default <all>
ip-xx-yy-zzz-32.eu-central-1.compute.internal xx.yy.zzz.203:8301 alive client 1.13.4 2 dc1 default <default>
ip-xx-yy-zzz-74.eu-central-1.compute.internal xx.yy.zzz.212:8301 alive client 1.13.4 2 dc1 default <default>
ip-xx-yy-zzx-4.eu-central-1.compute.internal xx.yy.zzx.240:8301 alive client 1.13.4 2 dc1 default <default>
num_peers = 0
kubectl exec -it consul-server-2 -n consul -- sh
/ $ netstat -tulpn
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 :::8500 :::* LISTEN 11/consul
tcp 0 0 :::8503 :::* LISTEN 11/consul
tcp 0 0 :::8600 :::* LISTEN 11/consul
tcp 0 0 :::8300 :::* LISTEN 11/consul
tcp 0 0 :::8301 :::* LISTEN 11/consul
tcp 0 0 :::8302 :::* LISTEN 11/consul
udp 0 0 :::8301 :::* 11/consul
udp 0 0 :::8302 :::* 11/consul
udp 0 0 :::8600 :::* 11/consul
/ $ consul members
Node Address Status Type Build Protocol DC Partition Segment
consul-server-2 xx.yy.zzz.225:8301 alive server 1.13.4 2 dc1 default <all>
ip-xx-yy-zzz-32.eu-central-1.compute.internal xx.yy.zzz.203:8301 alive client 1.13.4 2 dc1 default <default>
ip-xx-yy-zzz-74.eu-central-1.compute.internal xx.yy.zzz.212:8301 alive client 1.13.4 2 dc1 default <default>
ip-xx-yy-zzx-4.eu-central-1.compute.internal xx.yy.zzx.240:8301 alive client 1.13.4 2 dc1 default <default>
This setup works perfectly fine with a single Consul server replica.
Can someone please look into this and point me in the right direction?
Any help would be greatly appreciated. Thanks in advance!