Consul Connect jobs unable to talk to sidecar

Nomad version: 0.10.3
Consul version: 1.6.3

I’m attempting to deploy countdash, the Consul Connect example job shown in the documentation.

job "countdash" {
  datacenters = ["dc1"]

  group "api" {
    network {
      mode = "bridge"
    }

    service {
      name = "count-api"
      port = "9001"

      connect {
        sidecar_service {}
      }
    }

    task "web" {
      driver = "docker"

      config {
        image = "hashicorpnomad/counter-api:v1"
      }
    }
  }

  group "dashboard" {
    network {
      mode = "bridge"

      port "http" {
        static = 9002
        to     = 9002
      }
    }

    service {
      name = "count-dashboard"
      port = "9002"

      connect {
        sidecar_service {
          proxy {
            upstreams {
              destination_name = "count-api"
              local_bind_port  = 8080
            }
          }
        }
      }
    }

    task "dashboard" {
      driver = "docker"

      env {
        COUNTING_SERVICE_URL = "http://${NOMAD_UPSTREAM_ADDR_count_api}"
      }

      config {
        image = "hashicorpnomad/counter-dashboard:v1"
      }
    }
  }
}

I find that the service will not register in Consul.

In the logs I see:

Feb 20 18:25:20 nomadagent1 consul[9138]:     2020/02/20 18:25:20 [INFO] agent: Synced service "_nomad-task-ead8e20d-96ae-3878-39c3-60c6e86238ee-group-api-count-api-9001"
Feb 20 18:25:21 nomadagent1 consul[9138]:     2020/02/20 18:25:21 [INFO] agent: Synced service "_nomad-task-efcf7cd7-305b-07aa-c68b-081821b5eed5-group-dashboard-count-dashboard-9002"
Feb 20 18:25:21 nomadagent1 consul[9138]:     2020/02/20 18:25:21 [INFO] agent: Synced service "_nomad-task-efcf7cd7-305b-07aa-c68b-081821b5eed5-group-dashboard-count-dashboard-9002-sidecar-proxy"
Feb 20 18:25:25 nomadagent1 consul[9138]:     2020/02/20 18:25:25 [WARN] agent: Check "service:_nomad-task-ead8e20d-96ae-3878-39c3-60c6e86238ee-group-api-count-api-9001-sidecar-proxy:1" socket connection failed: dial tcp 127.0.0.1:22354: connect: connection refused
Feb 20 18:25:29 nomadagent1 consul[9138]:     2020/02/20 18:25:29 [WARN] agent: Check "service:_nomad-task-efcf7cd7-305b-07aa-c68b-081821b5eed5-group-dashboard-count-dashboard-9002-sidecar-proxy:1" socket connection failed: dial tcp 127.0.0.1:25732: connect: connection refused
Feb 20 18:25:35 nomadagent1 consul[9138]:     2020/02/20 18:25:35 [WARN] agent: Check "service:_nomad-task-ead8e20d-96ae-3878-39c3-60c6e86238ee-group-api-count-api-9001-sidecar-proxy:1" socket connection failed: dial tcp 127.0.0.1:22354: connect: connection refused
Feb 20 18:25:39 nomadagent1 consul[9138]:     2020/02/20 18:25:39 [WARN] agent: Check "service:_nomad-task-efcf7cd7-305b-07aa-c68b-081821b5eed5-group-dashboard-count-dashboard-9002-sidecar-proxy:1" socket connection failed: dial tcp 127.0.0.1:25732: connect: connection refused
Feb 20 18:25:45 nomadagent1 consul[9138]:     2020/02/20 18:25:45 [WARN] agent: Check "service:_nomad-task-ead8e20d-96ae-3878-39c3-60c6e86238ee-group-api-count-api-9001-sidecar-proxy:1" socket connection failed: dial tcp 127.0.0.1:22354: connect: connection refused
Feb 20 18:25:49 nomadagent1 consul[9138]:     2020/02/20 18:25:49 [WARN] agent: Check "service:_nomad-task-efcf7cd7-305b-07aa-c68b-081821b5eed5-group-dashboard-count-dashboard-9002-sidecar-proxy:1" socket connection failed: dial tcp 127.0.0.1:25732: connect: connection refused
Feb 20 18:25:55 nomadagent1 consul[9138]:     2020/02/20 18:25:55 [WARN] agent: Check "service:_nomad-task-ead8e20d-96ae-3878-39c3-60c6e86238ee-group-api-count-api-9001-sidecar-proxy:1" socket connection failed: dial tcp 127.0.0.1:22354: connect: connection refused
Feb 20 18:25:59 nomadagent1 consul[9138]:     2020/02/20 18:25:59 [WARN] agent: Check "service:_nomad-task-efcf7cd7-305b-07aa-c68b-081821b5eed5-group-dashboard-count-dashboard-9002-sidecar-proxy:1" socket connection failed: dial tcp 127.0.0.1:25732: connect: connection refused
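
To see at a glance which check targets are being refused, the warnings above can be condensed into one line per address. This is only a rough sketch: the function name failing_check_targets is made up for illustration, and it assumes the log lines have exactly the "dial tcp ADDR:PORT: connect: connection refused" shape shown above.

```shell
# Hypothetical helper: summarize refused check targets from Consul agent logs.
# Reads log lines on stdin and prints "<count> <addr:port>" per unique target.
failing_check_targets() {
  grep 'socket connection failed' \
    | sed -n 's/.*dial tcp \([0-9.]*:[0-9]*\): connect: connection refused.*/\1/p' \
    | sort \
    | uniq -c \
    | awk '{print $1, $2}'   # normalize the padding that uniq -c adds
}
```

It can be fed from wherever the agent logs live, for example `journalctl -u consul | failing_check_targets`, assuming Consul runs as a systemd unit named consul (an assumption, not something confirmed above).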

Looking at docker ps, I see the containers running:

docker ps
CONTAINER ID        IMAGE                                      COMMAND                  CREATED             STATUS              PORTS                                                    NAMES
f26f5de4e549        envoyproxy/envoy:v1.11.2                   "/docker-entrypoint.…"   36 hours ago        Up 36 hours                                                                  connect-proxy-count-api-70b0d090-1b34-53f9-79de-84254d1605ab
9bdd224c36eb        hashicorpnomad/counter-api:v1              "./counting-service"     36 hours ago        Up 36 hours                                                                  web-70b0d090-1b34-53f9-79de-84254d1605ab
f85166cea751        envoyproxy/envoy:v1.11.2                   "/docker-entrypoint.…"   36 hours ago        Up 36 hours                                                                  connect-proxy-count-dashboard-42ab2b9f-1c09-6a09-4520-6daf861fb643
6e810ca277a8        hashicorpnomad/counter-dashboard:v1        "./dashboard-service"    36 hours ago        Up 36 hours                                                                  dashboard-42ab2b9f-1c09-6a09-4520-6daf861fb643

Looking at netstat, there is nothing listening on port 24140:

netstat -plnt
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 10.47.70.2:8301         0.0.0.0:*               LISTEN      9138/consul
tcp        0      0 10.47.70.2:23504        0.0.0.0:*               LISTEN      26900/docker-proxy
tcp        0      0 10.47.70.2:27539        0.0.0.0:*               LISTEN      26771/docker-proxy
tcp        0      0 127.0.0.1:8500          0.0.0.0:*               LISTEN      9138/consul
tcp        0      0 127.0.0.1:8502          0.0.0.0:*               LISTEN      9138/consul
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      9819/sshd
tcp        0      0 127.0.0.1:8600          0.0.0.0:*               LISTEN      9138/consul
tcp        0      0 0.0.0.0:25              0.0.0.0:*               LISTEN      26971/master
tcp        0      0 0.0.0.0:5665            0.0.0.0:*               LISTEN      13437/icinga2
tcp6       0      0 :::4646                 :::*                    LISTEN      13092/nomad
tcp6       0      0 :::9998                 :::*                    LISTEN      6063/fabio
tcp6       0      0 :::9999                 :::*                    LISTEN      6063/fabio
tcp6       0      0 :::25                   :::*                    LISTEN      26971/master

I’ve ensured that the CNI plugins are installed, and that the OS is configured to allow container traffic through bridge networks:

cat /proc/sys/net/bridge/bridge-nf-call-arptables
1
cat /proc/sys/net/bridge/bridge-nf-call-ip6tables
1
cat /proc/sys/net/bridge/bridge-nf-call-iptables
1
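
The three checks above can be rolled into one repeatable command, which is handy when comparing several worker nodes. This is a sketch only: check_bridge_sysctls is a hypothetical name, and the optional directory argument exists purely so the function can be pointed at a different path.

```shell
# Hypothetical helper: print each bridge netfilter setting and return non-zero
# unless all three are set to 1. Defaults to /proc/sys/net/bridge.
check_bridge_sysctls() {
  dir="${1:-/proc/sys/net/bridge}"
  status=0
  for name in bridge-nf-call-arptables bridge-nf-call-ip6tables bridge-nf-call-iptables; do
    # A missing file is reported as "missing" instead of aborting the loop.
    value=$(cat "$dir/$name" 2>/dev/null || echo missing)
    echo "$name=$value"
    [ "$value" = "1" ] || status=1
  done
  return "$status"
}
```

Running `check_bridge_sysctls && echo OK` on each node prints the three values and reports OK only when every one of them is 1.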

What could be the reason that Consul Connect isn’t working? Do I need to disable ACLs?

A temporary workaround is to disable ACLs on just the Nomad worker agents. (There is no need to disable ACLs in the entire datacenter.)

What’s the correct way to run this job without disabling ACLs?

Nomad 0.10.4 just shipped with support for Connect with Consul ACLs, so that should fix the issue for you.

I have this problem, but no ACLs.

I’ve been trying to figure out how to debug it, and I see that it has been posted in a couple of threads. “Connect sidecar listening healthcheck fail” (#5 by david) is the first one I found, and I posted there.