Consul Helm chart fails in a Tanzu Kubernetes 1.17 environment

I’m following the tutorial at https://learn.hashicorp.com/tutorials/consul/kubernetes-custom-resource-definitions?in=consul/kubecon-2020 to install the hashicorp/consul Helm chart, version consul-0.26.0.

All the workload pods are stuck in Init:CrashLoopBackOff. The consul-connect-inject-init container logs this message: Error registering service: Put "http://10.115.3.5:8500/v1/agent/service/register": dial tcp 10.115.3.5:8500: connect: connection refused

10.115.3.5 is the IP of the k8s cluster host node.
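
One way to narrow this down is to probe the same address from a pod that is already running. The sketch below is only illustrative: it assumes `curl` is available in the `util-nosidecar` pod image (not confirmed here), and it uses Consul's `/v1/status/leader` HTTP endpoint as a harmless read-only request. The helper names `agent_url` and `probe_agent` are made up for this example.

```shell
#!/bin/sh
# Build the URL of a Consul agent's HTTP API for a given host and path.
# 8500 is the port shown in the error message above.
agent_url() {
  host="$1"
  path="$2"
  printf 'http://%s:8500%s\n' "$host" "$path"
}

# Probe the agent from inside an existing pod.
# Assumption: the pod's image ships curl; substitute any pod that has it.
probe_agent() {
  pod="$1"
  host="$2"
  kubectl exec "$pod" -- curl -sS --max-time 5 "$(agent_url "$host" /v1/status/leader)"
}

# Example (not run automatically):
#   probe_agent util-nosidecar-788df87b75-s6l2p 10.115.3.5
```

A "connection refused" from this probe as well would suggest that nothing is reachable on port 8500 at the node IP from the pod network, rather than a problem specific to the init container.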

Some context data:

kubectl get pvc
NAME                           STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
data-default-consul-server-0   Bound    pvc-a83f3df6-c1fc-4df0-84f8-6647c69feb9b   10Gi       RWO            thindisk       127m

kubectl get pods
NAME                                                          READY   STATUS                  RESTARTS   AGE
consul-2cx4k                                                  1/1     Running                 0          48m
consul-connect-injector-webhook-deployment-6ddc4cfc85-s8ljj   1/1     Running                 0          48m
consul-controller-5d887d5bf-qfn5t                             1/1     Running                 0          48m
consul-server-0                                               1/1     Running                 0          48m
consul-webhook-cert-manager-5d588db7bb-9mldc                  1/1     Running                 0          48m
frontend-679ff56c5c-xr4d9                                     0/3     Init:CrashLoopBackOff   12         41m
postgres-79d6f8d464-ng8qb                                     0/3     Init:CrashLoopBackOff   12         41m
product-api-77894c9b87-5jsvx                                  0/3     Init:CrashLoopBackOff   12         41m
public-api-547b5cc97f-sfqrf                                   0/3     Init:CrashLoopBackOff   12         41m
util-nosidecar-788df87b75-s6l2p                               1/1     Running                 0          60m

kubectl get services
NAME                          TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                                                                   AGE
consul-connect-injector-svc   ClusterIP   10.100.200.3     <none>        443/TCP                                                                   60m
consul-controller-webhook     ClusterIP   10.100.200.211   <none>        443/TCP                                                                   60m
consul-dns                    ClusterIP   10.100.200.43    <none>        53/TCP,53/UDP                                                             60m
consul-server                 ClusterIP   None             <none>        8500/TCP,8301/TCP,8301/UDP,8302/TCP,8302/UDP,8300/TCP,8600/TCP,8600/UDP   60m
consul-ui                     ClusterIP   10.100.200.142   <none>        80/TCP                                                                    60m
frontend                      ClusterIP   10.100.200.11    <none>        80/TCP                                                                    53m
kubernetes                    ClusterIP   10.100.200.1     <none>        443/TCP                                                                   3d
postgres                      ClusterIP   10.100.200.241   <none>        5432/TCP                                                                  53m
product-api                   ClusterIP   10.100.200.191   <none>        9090/TCP                                                                  53m
public-api                    ClusterIP   10.100.200.144   <none>        8080/TCP                                                                  53m

values file:

cat /home/dillon/k8s-yaml/helm/consul/helm-consul-values-minimal.v3.kubecon.yaml
global:
  name: consul
  datacenter: dc1
  # override the chart image with the latest 1.9 consul image
  # this is required since service intentions are a new 1.9.0
  # beta feature
  image: consul:1.9.0-beta1

server:
  # use 1 server
  replicas: 1
  bootstrapExpect: 1
  disruptionBudget:
    enabled: true
    maxUnavailable: 0

connectInject:
  enabled: true
  # inject an envoy sidecar into every new pod,
  # except for those with annotations that prevent injection
  default: true

# enable CRDs
controller:
  enabled: true

This is a single node k8s cluster.

Regards,
Paul

Hi Paul,

Since this is a single-node Kubernetes cluster, commenting out the following setting may help you get this running:

  # Affinity Settings
  # Commenting out or setting as empty the affinity variable, will allow
  # deployment to single node services such as Minikube
  affinity: |
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchLabels:
              app: {{ template "consul.name" . }}
              release: "{{ .Release.Name }}"
              component: server
          topologyKey: kubernetes.io/hostname
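
If you are using a trimmed-down values file, the same effect can probably be had with a small override file. This is a sketch, assuming the chart treats an empty `server.affinity` as "no affinity rules", as the comment above describes. The file name `affinity-override.yaml` and the release name `consul` are placeholders, and `apply_override` is a made-up helper.

```shell
#!/bin/sh
# Write a hypothetical override file that blanks out the server affinity.
cat > affinity-override.yaml <<'EOF'
server:
  # Empty value: no pod anti-affinity, so servers may share a node.
  affinity: ""
EOF

# Apply it on top of the existing values file (not run automatically).
# Assumption: the release is named "consul" and was installed from
# hashicorp/consul at chart version 0.26.0.
apply_override() {
  helm upgrade consul hashicorp/consul --version 0.26.0 \
    -f helm-consul-values-minimal.v3.kubecon.yaml \
    -f affinity-override.yaml
}
```

Later `-f` files take precedence in Helm, so the override only changes the affinity and leaves the rest of the minimal values file as it is.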

Thanks for your reply David-yu.

That setting isn’t in the values file I am using. Note that the tutorial asks us to override the chart’s default values.yaml with a simpler one that does not have the affinity settings:

global:
  name: consul
  datacenter: dc1
  # override the chart image with the latest 1.9 consul image
  # this is required since service intentions are a new 1.9.0
  # beta feature
  image: consul:1.9.0-beta1

server:
  # use 1 server
  replicas: 1
  bootstrapExpect: 1
  disruptionBudget:
    enabled: true
    maxUnavailable: 0

connectInject:
  enabled: true
  # inject an envoy sidecar into every new pod,
  # except for those with annotations that prevent injection
  default: true

# enable CRDs
controller:
  enabled: true

I guess I don’t understand why all the workload init containers (consul-connect-inject-init) are trying to connect to the node IP on port 8500. It seems like they should be connecting on port 8500 to the server service, which is in DNS; the client pod finds that DNS entry during its startup.
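
For what it's worth, the address in the error is consistent with the init container registering against a Consul client agent on its own node rather than against the server service. A minimal sketch of that assumption: the pod's host IP (which Kubernetes exposes through the downward API field `status.hostIP`) is joined with the agent HTTP port 8500. The function and variable names below are illustrative, and the DaemonSet name `consul` and the port name `http` are guesses based on the pod list above, not confirmed chart internals.

```shell
#!/bin/sh
# Assumed behaviour: the init container targets the agent on its own node.
# HOST_IP stands in for the value Kubernetes reports as status.hostIP.
register_url() {
  HOST_IP="$1"
  CONSUL_HTTP_ADDR="${HOST_IP}:8500"
  printf 'http://%s/v1/agent/service/register\n' "$CONSUL_HTTP_ADDR"
}

# Show which host port (if any) the client DaemonSet maps for its HTTP API.
# Assumption: the DaemonSet is named "consul" and the port is named "http".
client_host_port() {
  kubectl get daemonset consul \
    -o jsonpath='{.spec.template.spec.containers[0].ports[?(@.name=="http")].hostPort}'
}

# Reproduces the URL from the error message for the node IP in question.
register_url 10.115.3.5
```

If `client_host_port` prints 8500 but the node IP still refuses connections, the host port mapping itself may not be honoured by the cluster's network plugin, which would be worth checking on the Tanzu side.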