Consul connect with envoy does not work as described in official tutorial

Hi everyone,

I’m new to Consul and I am trying to get familiar with Consul Service Mesh / Connect.

My setup consists of 6 VM nodes:

  • 3x Server
  • 3x Client

Service registration, API querying, etc. work fine. There is no firewall whatsoever blocking any traffic between those nodes.

I am following the official guide "Secure Service Communication with Consul Service Mesh and Envoy | Consul - HashiCorp Learn", including all of its examples.
The only deviation: instead of installing Envoy with func-e as described in that doc, I downloaded the binary for version 1.22.2 directly from their archive (that version is listed as compatible with Consul 1.13.x in "Connect - Envoy Integration | Consul by HashiCorp").
My Consul version is 1.13.1.

Following the guide, I end up with the following specifications created in Consul:

  • For the counting node:
{
  "counting-1": {
    "ID": "counting-1",
    "Service": "counting",
    "Tags": [],
    "Meta": {},
    "Port": 9003,
    "Address": "",
    "Weights": {
      "Passing": 1,
      "Warning": 1
    },
    "EnableTagOverride": false,
    "Datacenter": "sbsdev"
  },
  "counting-1-sidecar-proxy": {
    "Kind": "connect-proxy",
    "ID": "counting-1-sidecar-proxy",
    "Service": "counting-sidecar-proxy",
    "Tags": [],
    "Meta": {},
    "Port": 21000,
    "Address": "",
    "TaggedAddresses": {
      "consul-virtual": {
        "Address": "240.0.0.3",
        "Port": 21000
      }
    },
    "Weights": {
      "Passing": 1,
      "Warning": 1
    },
    "EnableTagOverride": false,
    "Proxy": {
      "DestinationServiceName": "counting",
      "DestinationServiceID": "counting-1",
      "LocalServiceAddress": "127.0.0.1",
      "LocalServicePort": 9003,
      "MeshGateway": {},
      "Expose": {}
    },
    "Datacenter": "sbsdev"
  }
}
  • For the dashboard node:
{
  "dashboard": {
    "ID": "dashboard",
    "Service": "dashboard",
    "Tags": [],
    "Meta": {},
    "Port": 9002,
    "Address": "",
    "Weights": {
      "Passing": 1,
      "Warning": 1
    },
    "EnableTagOverride": false,
    "Datacenter": "sbsdev"
  },
  "dashboard-sidecar-proxy": {
    "Kind": "connect-proxy",
    "ID": "dashboard-sidecar-proxy",
    "Service": "dashboard-sidecar-proxy",
    "Tags": [],
    "Meta": {},
    "Port": 21000,
    "Address": "",
    "TaggedAddresses": {
      "consul-virtual": {
        "Address": "240.0.0.4",
        "Port": 21000
      }
    },
    "Weights": {
      "Passing": 1,
      "Warning": 1
    },
    "EnableTagOverride": false,
    "Proxy": {
      "DestinationServiceName": "dashboard",
      "DestinationServiceID": "dashboard",
      "LocalServiceAddress": "127.0.0.1",
      "LocalServicePort": 9002,
      "Upstreams": [
        {
          "DestinationType": "service",
          "DestinationName": "counting",
          "LocalBindPort": 5000,
          "MeshGateway": {}
        }
      ],
      "MeshGateway": {},
      "Expose": {}
    },
    "Datacenter": "sbsdev"
  }
}
  • Intention (even though the default is allow):
> consul intention get dashboard counting
Source:       dashboard
Destination:  counting
Action:       allow
Created At:   Monday, 01-Jan-01 00:53:28 LMT
  • ACL (default_policy is allow):
> consul acl policy list
global-management:
   ID:           00000000-0000-0000-0000-000000000001
   Description:  Builtin Policy that grants unlimited access
   Datacenters:
node-policy:
   ID:           68fe7c9e-7242-f166-3f68-d5e3ad5d65e5
   Description:
   Datacenters:

> consul acl policy read -name global-management
ID:           00000000-0000-0000-0000-000000000001
Name:         global-management
Description:  Builtin Policy that grants unlimited access
Datacenters:
Rules:

acl = "write"
agent_prefix "" {
	policy = "write"
}
event_prefix "" {
	policy = "write"
}
key_prefix "" {
	policy = "write"
}
keyring = "write"
node_prefix "" {
	policy = "write"
}
operator = "write"
mesh = "write"
peering = "write"
query_prefix "" {
	policy = "write"
}
service_prefix "" {
	policy = "write"
	intentions = "write"
}
session_prefix "" {
	policy = "write"
}

> consul acl policy read -name node-policy
ID:           68fe7c9e-7242-f166-3f68-d5e3ad5d65e5
Name:         node-policy
Description:
Datacenters:
Rules:
agent_prefix "" {
  policy = "write"
}
node_prefix "" {
  policy = "write"
}
service_prefix "" {
  policy = "read"
}
session_prefix "" {
  policy = "read"
}

Both the dashboard and the counting example apps are running.

When I start the built-in Consul proxy, everything is fine; the UI shows “green” for everything and the dashboard displays a rising number:

  • proxy counting
> consul connect proxy -sidecar-for counting-1
==> Consul Connect proxy starting...
    Configuration mode: Agent API
        Sidecar for ID: counting-1
              Proxy ID: counting-1-sidecar-proxy

==> Log data will now stream in as it occurs:

    2022-08-17T17:11:56.493+0200 [INFO]  proxy: Proxy loaded config and ready to serve
    2022-08-17T17:11:56.493+0200 [INFO]  proxy: Parsed TLS identity: uri=spiffe://ed29c670-fc08-7b83-6b48-af1ed1ce3251.consul/ns/default/dc/sbsdev/svc/counting
    2022-08-17T17:11:56.493+0200 [INFO]  proxy: Starting listener: listener="public listener" bind_addr=0.0.0.0:21000
  • proxy dashboard
> consul connect proxy -sidecar-for dashboard
==> Consul Connect proxy starting...
    Configuration mode: Agent API
        Sidecar for ID: dashboard
              Proxy ID: dashboard-sidecar-proxy

==> Log data will now stream in as it occurs:

    2022-08-17T17:11:58.336+0200 [INFO]  proxy: Starting listener: listener=127.0.0.1:5000->service:default/default/counting bind_addr=127.0.0.1:5000
    2022-08-17T17:11:58.337+0200 [INFO]  proxy: Proxy loaded config and ready to serve
    2022-08-17T17:11:58.337+0200 [INFO]  proxy: Parsed TLS identity: uri=spiffe://ed29c670-fc08-7b83-6b48-af1ed1ce3251.consul/ns/default/dc/sbsdev/svc/dashboard
    2022-08-17T17:11:58.338+0200 [INFO]  proxy: Starting listener: listener="public listener" bind_addr=0.0.0.0:21000

When I use the envoy command line as described in the guide, it does not work, and it produces a lot of output that I can’t really interpret, since I’m not familiar with Envoy:

  • envoy counting:
> consul connect envoy -sidecar-for counting-1 -admin-bind localhost:19001

Please find the output here: envoy counting output

  • envoy dashboard:
> consul connect envoy -sidecar-for dashboard

Please find the output here: envoy dashboard output

Sorry for these tons of logs, but that’s what I experience when I follow the official guides …

Can anyone tell me what I am doing wrong?

Hi @The-Judge,

Judging from the following log lines, the Envoy instance seems to be unable to talk to the Connect gRPC port.

[2022-08-17 17:16:12.223][1364051][warning][config] [source/common/config/grpc_subscription_impl.cc:118] gRPC config: initial fetch timed out for type.googleapis.com/envoy.config.listener.v3.Listener
[2022-08-17 17:16:23.959][1364051][warning][config] [./source/common/config/grpc_stream.h:196] DeltaAggregatedResources gRPC config stream closed since 41s ago: 14, upstream connect error or disconnect/reset before headers. reset reason: connection termination

One possible reason for the above error is that your agents are running with TLS, but you are not running the consul connect envoy command with the matching TLS settings.
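
If you want to pick these lines out of the large Envoy output yourself, here is a minimal Python sketch. The two patterns are copied from the warnings quoted above; treating them as indicators of a gRPC config (xDS) connection problem is my assumption, and the function name find_xds_warnings is made up for illustration.

```python
import re

# Substrings taken from the two Envoy warnings quoted in this thread.
# Treating them as signs of a gRPC config connection problem is an assumption.
XDS_WARNING_PATTERNS = (
    re.compile(r"gRPC config: initial fetch timed out"),
    re.compile(r"gRPC config stream closed"),
)


def find_xds_warnings(log_lines):
    """Return the log lines that match any of the patterns above."""
    return [
        line.rstrip("\n")
        for line in log_lines
        if any(pattern.search(line) for pattern in XDS_WARNING_PATTERNS)
    ]


if __name__ == "__main__":
    sample = [
        "[2022-08-17 17:16:12.223][1364051][warning][config] "
        "[source/common/config/grpc_subscription_impl.cc:118] gRPC config: "
        "initial fetch timed out for "
        "type.googleapis.com/envoy.config.listener.v3.Listener\n",
        "[info] some unrelated line\n",
    ]
    for hit in find_xds_warnings(sample):
        print(hit)
```

You could feed it the saved Envoy output, for example with open("envoy.log") as the iterable (the file name is hypothetical).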

From the other topics in this forum, I know that you have enabled HTTPS for your agents and are using auto-encrypt. When you enable HTTPS, the Connect gRPC endpoint uses the same TLS settings (unless you override them with the tls.grpc config, which is available from Consul 1.12).

So to get this working, make sure that you set the TLS settings as environment variables or via CLI args.

In your case, since you use auto-encrypt, follow these steps:

  1. Get the ConnectCA Root
    $ curl -s localhost:8500/v1/connect/ca/roots | jq -r ".Roots[].RootCert" > connect-ca.pem
    
  2. Set the CACert environment variable
    $ export CONSUL_CACERT=./connect-ca.pem
    
  3. Set TLS Consul API Address
    $ export CONSUL_HTTP_ADDR=https://localhost:8501
    
  4. In addition to the above, set the CONSUL_CLIENT_CERT and CONSUL_CLIENT_KEY if you have verify_incoming=true on the Consul agent for Client Cert Authentication.

Once you have the above, launch envoy using the consul connect envoy command.
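
If you prefer to script this rather than export the variables by hand, the steps above can be sketched in Python roughly as follows. This is only a sketch under assumptions: the helper names (extract_root_certs, build_consul_env, envoy_command) are made up for illustration, the JSON is expected to have the shape returned by /v1/connect/ca/roots (a "Roots" list whose entries carry "RootCert"), and the addresses are the ones from this thread.

```python
import json
import os


def extract_root_certs(roots_json):
    """Join every RootCert of a /v1/connect/ca/roots response into one PEM
    bundle, roughly what the jq filter `.Roots[].RootCert` does in step 1."""
    roots = json.loads(roots_json)["Roots"]
    return "\n".join(root["RootCert"].strip() for root in roots) + "\n"


def build_consul_env(ca_path, http_addr="https://localhost:8501",
                     client_cert=None, client_key=None, base_env=None):
    """Return a copy of the environment with the Consul TLS variables set
    (steps 2 to 4). The client cert and key are only added if both are given."""
    env = dict(os.environ if base_env is None else base_env)
    env["CONSUL_CACERT"] = ca_path
    env["CONSUL_HTTP_ADDR"] = http_addr
    if client_cert and client_key:
        env["CONSUL_CLIENT_CERT"] = client_cert
        env["CONSUL_CLIENT_KEY"] = client_key
    return env


def envoy_command(sidecar_for, admin_bind=None):
    """Build the `consul connect envoy` argument list used in this thread."""
    cmd = ["consul", "connect", "envoy", "-sidecar-for", sidecar_for]
    if admin_bind:
        cmd += ["-admin-bind", admin_bind]
    return cmd


if __name__ == "__main__":
    # Only prints what would be run; nothing is executed here.
    env = build_consul_env("./connect-ca.pem", base_env={})
    print(env)
    print(" ".join(envoy_command("counting-1", "localhost:19001")))
```

To actually launch the proxy you would write the result of extract_root_certs to connect-ca.pem and then pass both pieces to subprocess.run(envoy_command("counting-1", "localhost:19001"), env=build_consul_env("./connect-ca.pem")).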

The steps in the documentation didn’t work for your setup because the docs were written for a -dev agent (with no TLS or ACLs). But I agree that this additional information should have been mentioned in the document’s Extend these Concepts section.

I hope this helps.