All workload pods stuck in CrashLoopBackOff after installing the hashicorp/consul chart on TKGI-based Kubernetes

This is a single-node/host cluster. Why are the consul-connect-inject-init containers in workload pods trying to register to $HOST_IP:8500? They are getting "connection refused" and all are stuck in Init:CrashLoopBackOff. The consul-connect-inject-init container logs "Error registering service: Put "": dial tcp connect: connection refused." The target address is the IP of the k8s cluster host node.
All helm chart resources are installed and running.


Did you use the following values for the servers' bootstrap? If not, the defaults are 3 replicas for the server StatefulSet and a bootstrapExpect of 3, so the Raft quorum cannot form, since your last two pods will stay Pending:

  server:
    replicas: 2
    bootstrapExpect: 2
  connect:
    enabled: true
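Those overrides would then be passed at install time, e.g. (release name and values file name are illustrative):

```shell
# Install the chart with the overrides above (release name illustrative)
helm install consul hashicorp/consul -f values.yaml
```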

Hope it helps

Hi @drfooser,

The consul-connect-inject-init container is trying to connect to the local client agent so that it can register itself with Consul. See the architecture section of Installing Consul on Kubernetes for more info about the client and server agents which are deployed.
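Concretely, the init container's registration is an HTTP PUT to the client agent's API on the pod's own node IP; in curl terms it is roughly this (the service payload and HOST_IP are illustrative, not taken from your setup):

```shell
# Roughly what consul-connect-inject-init attempts: register the service
# with the *local* client agent via the node (host) IP. This is the Put
# that is failing with "connection refused" in this thread.
curl -sS -X PUT "http://${HOST_IP}:8500/v1/agent/service/register" \
  -d '{"Name": "util", "Port": 8080}'
```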

Can you check the Consul client logs with the following command to see if there are errors, or any other indication as to why it may not be listening on port 8500?

kubectl logs --selector="app=consul,component=client"

Blake, thanks for your help. See my notes below,

After helm install all pods are running - client logs state…

dillon FMY >kubectl logs --selector="app=consul,component=client"

2020-12-01T20:09:53.330Z [WARN] agent.router.manager: No servers available
2020-12-01T20:09:53.330Z [ERROR] agent.http: Request error: method=GET url=/v1/status/leader from= error="No known Consul servers"
2020-12-01T20:09:54.390Z [WARN] agent.router.manager: No servers available
2020-12-01T20:09:54.390Z [ERROR] agent.anti_entropy: failed to sync remote state: error="No known Consul servers"
2020-12-01T20:09:58.806Z [INFO] agent: (LAN) joining: lan_addresses=[consul-server-0.consul-server.default.svc]
2020-12-01T20:09:58.820Z [INFO] agent.client.serf.lan: serf: EventMemberJoin: consul-server-0
2020-12-01T20:09:58.820Z [INFO] agent: (LAN) joined: number_of_nodes=1
2020-12-01T20:09:58.820Z [INFO] agent: Join cluster completed. Synced with initial agents: cluster=LAN num_agents=1
2020-12-01T20:09:58.820Z [INFO] agent.client: adding server: server="consul-server-0 (Addr: tcp/ (DC: dc1)"
2020-12-01T20:10:00.863Z [INFO] agent: Synced node info

Then I install a simple utility deployment …

dillon FMY >k apply -f ~/k8s-yaml/util-deployment-sidecar.yaml
deployment.apps/util-sidecar created

Deployment is failing at init

dillon FMY >k get pods


consul-676vm 1/1 Running 0 32m
consul-connect-injector-webhook-deployment-6ddc4cfc85-qztf4 1/1 Running 0 32m
consul-controller-5d887d5bf-6ml9b 1/1 Running 0 32m
consul-server-0 1/1 Running 0 32m
consul-webhook-cert-manager-5d588db7bb-jz7lv 1/1 Running 0 32m
util-nosidecar-788df87b75-s6l2p 1/1 Running 0 8d
util-sidecar-5f98688568-7jjb6 0/3 Init:Error 1 10s

consul connect inject init container logs state…

dillon FMY >k logs util-sidecar-5f98688568-7jjb6 -c consul-connect-inject-init
Error registering service "util": Put "": dial tcp connect: connection refused

All other container logs state…

dillon FMY >k logs util-sidecar-5f98688568-7jjb6 -c consul-connect-lifecycle-sidecar
Error from server (BadRequest): container "consul-connect-lifecycle-sidecar" in pod "util-sidecar-5f98688568-7jjb6" is waiting to start: PodInitializing
dillon FMY >k logs util-sidecar-5f98688568-7jjb6 -c consul-connect-envoy-sidecar
Error from server (BadRequest): container "consul-connect-envoy-sidecar" in pod "util-sidecar-5f98688568-7jjb6" is waiting to start: PodInitializing
dillon FMY >k logs util-sidecar-5f98688568-7jjb6 -c util
Error from server (BadRequest): container "util" in pod "util-sidecar-5f98688568-7jjb6" is waiting to start: PodInitializing

Installed services…

dillon FMY >k get services -A
default consul-connect-injector-svc ClusterIP 443/TCP 41m
default consul-controller-webhook ClusterIP 443/TCP 41m
default consul-dns ClusterIP 53/TCP,53/UDP 41m
default consul-server ClusterIP None 8500/TCP,8301/TCP,8301/UDP,8302/TCP,8302/UDP,8300/TCP,8600/TCP,8600/UDP 41m
default consul-ui ClusterIP 80/TCP 41m
default frontend ClusterIP 80/TCP 7d22h
default kubernetes ClusterIP 443/TCP 11d
default postgres ClusterIP 5432/TCP 7d22h
default product-api ClusterIP 9090/TCP 7d22h
default public-api ClusterIP 8080/TCP 7d22h
kube-system kube-dns ClusterIP 53/UDP,53/TCP 11d
kube-system metrics-server ClusterIP 443/TCP 11d
kube-system tiller-deploy ClusterIP 44134/TCP 11d
pks-system fluent-bit ClusterIP 24224/TCP 11d
pks-system node-exporter ClusterIP 10536/TCP 11d
pks-system validator ClusterIP 443/TCP 11d
dillon FMY >

I’m stuck!


The Consul client (which runs as a DaemonSet and uses hostPort) does not appear to be reachable from the node where the pod is running.

Can you verify this pod is listening on port 8500, and that it is responding to HTTP requests issued from within the pod?
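For example, something along these lines from a pod that is already running (pod name taken from the listing in this thread; this assumes curl exists in the image and that the node's IP is substituted for, or exported as, HOST_IP):

```shell
# Probe the client agent from inside an existing workload pod.
# HOST_IP is expanded by the local shell, so set it to the node's IP first.
kubectl exec util-nosidecar-788df87b75-s6l2p -- \
  curl -sf "http://${HOST_IP}:8500/v1/status/leader"
```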

Thanks Blake,
This reveals the problem. We are using NSX-T which does not support hostPort. It does support nodePort.

I am thinking that I can try creating a NodePort service for the client DaemonSet and pointing the init containers to that.
Can you share your thoughts about that?
Do you have a better suggestion?

Hello Blake,

As I stated, NSX-T does not support hostPort. It does support nodePort.

I can build a NodePort service to expose a port on the node and proxy traffic to the clients, but the default range for nodePort, specified by the --service-node-port-range flag on the K8s control plane, is 30000-32767, and changing that to include 8500 might not be permitted. So instead I'd like to change the port that the consul-connect-inject-init container uses to register with the client agent.
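For what it's worth, the kind of Service being discussed would look roughly like this (a sketch: the selector labels match the client pods as used in the kubectl logs command earlier in the thread; the name and nodePort value are illustrative):

```yaml
# Sketch of a NodePort Service in front of the client DaemonSet.
# Name and nodePort are illustrative; 30850 sits inside the default
# --service-node-port-range of 30000-32767.
apiVersion: v1
kind: Service
metadata:
  name: consul-client-nodeport
spec:
  type: NodePort
  selector:
    app: consul
    component: client
  ports:
    - name: http
      port: 8500
      targetPort: 8500
      nodePort: 30850
```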

Looking through the helm chart templates, I’m having a hard time understanding how to change that port for the init containers.

Can you help?

Regards, and thanks for your assistance.



I finally settled on the hostNetwork configuration option. That exposed 8500 on the node's IP.
Including 8500 in the service-node-port-range seemed like a stretch, so we abandoned the idea of a NodePort.
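For readers following along, hostNetwork is a plain pod-spec field; on the client pods it looks roughly like this (generic Kubernetes fields shown, not chart-specific keys):

```yaml
# Generic pod-spec fragment: with hostNetwork the client binds directly
# in the node's network namespace, so port 8500 appears on the node's IP.
spec:
  hostNetwork: true
  dnsPolicy: ClusterFirstWithHostNet  # keep cluster DNS resolution on the host network
```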


Glad to hear it! A NodePort wouldn't work because the request could go to any Consul client, but it must go to the Consul client on the same node.

Well, I'd like your opinion on a workaround for that, because I'm getting pushback from security about using hostNetwork.

If I create a NodePort service for each individual client, with the selector targeting only that one client, it seems like I could theoretically link all proxies on node A to the client on node A through a NodePort service whose labels and selector fields are set for node A.
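Sketched out, that per-node idea needs a label unique to the client pod on each node, since a Service selector cannot match on node name directly (all names and the extra label here are hypothetical):

```yaml
# One Service per node; the selector relies on a per-node label that the
# DaemonSet pods do not carry by default, so it would have to be added.
apiVersion: v1
kind: Service
metadata:
  name: consul-client-node-a   # hypothetical, one per node
spec:
  type: NodePort
  selector:
    app: consul
    component: client
    node: node-a               # hypothetical per-node label
  ports:
    - port: 8500
      targetPort: 8500
      nodePort: 30851          # each per-node Service needs its own nodePort
```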


Hi @drfooser, I was looking at some of the NSX-T CNI release notes, and it looks like enabling hostPort is an option for you on NSX-T if you are on Ubuntu. However, this issue is still outstanding on RHEL/CentOS. Are you on RHEL/CentOS by chance?

Issue 2697547: HostPort not supported on RHEL/CentOS/RHCOS nodes
You can specify hostPorts on native Kubernetes and PKS on Ubuntu nodes by setting 'enable_hostport_snat' to True in nsx-node-agent ConfigMap. However, on RHEL/CentOS/RHCOS nodes hostPort is not supported and the parameter 'enable_hostport_snat' is ignored.
Workaround: None