Envoy sidecar proxy not listening on configured port

Hi there,

I have a test Consul setup with 3 machines. All the "consul" ports are open on all interfaces. I am running Consul in server mode on one of them, and the other two are just clients.

All machines are running consul version:

Consul v1.13.1
Revision c6d0f9ec

Then I have two services: frontend and backend. The frontend accesses the backend via its sidecar proxy (see the service setup below).

The UI and the CLI (see below) do not show any errors.

The problem is that my frontend is supposed to connect to the backend via the sidecar proxy, but the proxy is not listening on the port that I have set up in the Consul service configuration.

The only error I am seeing is this one from the consul connect envoy command:

[bazel-out/k8-opt/bin/external/envoy/source/common/config/_virtual_includes/grpc_stream_lib/common/config/grpc_stream.h:101] DeltaAggregatedResources gRPC config stream closed: 14, upstream connect error or disconnect/reset before headers. reset reason: connection failure

I am supposed to have an Envoy listener on port 6001, but I don't see it, so my frontend cannot connect to my backend:

$ sudo lsof -nP -iTCP -sTCP:LISTEN
COMMAND     PID            USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
sshd        650            root    3u  IPv4  23426      0t0  TCP *:22 (LISTEN)
sshd        650            root    4u  IPv6  23437      0t0  TCP *:22 (LISTEN)
systemd-r 61529 systemd-resolve   13u  IPv4 340217      0t0  TCP 127.0.0.53:53 (LISTEN)
frontend  87885            root    3u  IPv6 528618      0t0  TCP *:6060 (LISTEN)
consul    88960          consul    7u  IPv6 547288      0t0  TCP *:8301 (LISTEN)
consul    88960          consul   10u  IPv6 547295      0t0  TCP *:8600 (LISTEN)
consul    88960          consul   11u  IPv6 547297      0t0  TCP *:8500 (LISTEN)
envoy     88979            root   14u  IPv4 547601      0t0  TCP *:21000 (LISTEN)
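
For context on what I expect: once that upstream listener exists, the frontend should be able to reach the backend through localhost on node1. Roughly something like this (a sketch, assuming the backend speaks plain HTTP on its service port):

$ curl -v http://127.0.0.1:6001/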

Some more configuration details:

$ consul members
Node              Address             Status  Type    Build   Protocol  DC   Partition  Segment
ip-172-31-25-246  172.31.25.246:8301  alive   server  1.13.1  2         dc1  default    <all>
ip-172-31-27-145  172.31.27.145:8301  alive   client  1.13.1  2         dc1  default    <default>
ip-172-31-28-0    172.31.28.0:8301    alive   client  1.13.1  2         dc1  default    <default>

$ consul catalog services
backend
backend-sidecar-proxy
consul
frontend
frontend-sidecar-proxy

consul server config:

$ cat /etc/consul.d/*.hcl | grep -v "^#"
data_dir = "/opt/consul"

connect {
  enabled = true
}

ports {
  grpc = 8502
}

server = true

bootstrap_expect = 1

ui_config {
  enabled = true
}

client_addr = "0.0.0.0"

bind_addr = "0.0.0.0"

consul clients config:

cat /etc/consul.d/consul.hcl | grep -v "^#"
data_dir = "/opt/consul"
client_addr = "0.0.0.0"
bind_addr = "0.0.0.0" # Listen on all IPv4

Frontend service configuration (running in node1):

$ cat /etc/consul.d/frontend.hcl
service {
  name = "frontend"

  # frontend runs on port 6060.
  port = 6060

  # The "connect" stanza configures service mesh
  # features.
  connect {
    sidecar_service {
      # frontend's proxy will listen on port 21000.
      port = 21000

      proxy {
        # The "upstreams" stanza configures
        # which ports the sidecar proxy will expose
        # and what services they'll route to.
        upstreams = [
          {
            # Here you're configuring the sidecar proxy to
            # proxy port 6001 to the backend service.
            destination_name = "backend"
            local_bind_port  = 6001
          }
        ]
      }
    }
  }
}
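
The backend registration on the other node is not shown here, but for a Connect sidecar setup it would look roughly like this (a sketch; the backend's own port value is an assumption on my part):

$ cat /etc/consul.d/backend.hcl
service {
  name = "backend"

  # Assumed port for the backend application itself;
  # the real value is not shown in this post.
  port = 7000

  # Register a sidecar proxy with default settings so the
  # frontend's upstream (local_bind_port 6001) has something
  # to route to.
  connect {
    sidecar_service {}
  }
}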

I run the sidecar proxy via systemd:

cat /etc/systemd/system/frontend-sidecar-proxy.service
[Unit]
Description="Frontend sidecar proxy service"
Requires=network-online.target
After=network-online.target

[Service]
ExecStart=/usr/bin/consul connect envoy -sidecar-for frontend \
  -admin-bind 0.0.0.0:21000
Restart=on-failure

[Install]
WantedBy=multi-user.target
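
As a debugging aid, the Envoy admin interface (bound to 21000 in the unit above) can show whether Envoy has received any listener configuration from Consul. Something like this, assuming the admin port is reachable locally:

$ curl -s http://127.0.0.1:21000/listeners
$ curl -s http://127.0.0.1:21000/config_dump | head

If the xDS stream to Consul is failing (as in the gRPC error above), these stay essentially empty.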

Any advice or comments are welcome.

Thank you,
-drd

I noticed my Envoy version is not compatible with Consul. It is a relatively old version, even though I followed the Ubuntu install instructions from the Envoy website.

$ consul version
Consul v1.13.1
Revision c6d0f9ec
...
$ envoy --version
envoy  version: d362e791eb9e4efa8d87f6d878740e72dc8330ac/1.18.2/clean-getenvoy-76c310e-envoy/RELEASE/BoringSSL

That may be the issue. I am now trying to figure out how to get a more up-to-date Envoy.

I found instructions to install an up-to-date version of Envoy and installed Envoy 1.23.0, but that did not fix the issue.
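
For reference, one way to install a specific Envoy version is via the func-e tool from func-e.io (a sketch of what I used; the exact install URL and paths are from memory and may need checking):

$ curl -L https://func-e.io/install.sh | sudo bash -s -- -b /usr/local/bin
$ func-e use 1.23.0
$ sudo cp ~/.func-e/versions/1.23.0/bin/envoy /usr/local/bin/envoy
$ envoy --version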

Hi @drio,

Welcome to the HashiCorp Forums!

Your Envoy instance is not working as expected because you don't have the Consul gRPC port enabled on the client agents. The Consul gRPC port is where Consul hosts the Envoy xDS API, which Envoy uses to fetch its configuration.

Enable port 8502 (ports { grpc = 8502 }) in your client configurations, restart Consul, then restart Envoy, and you should have the sidecars working.
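
For reference, the client agent config with the fix would look something like this (a sketch based on the files already shared in this thread):

# /etc/consul.d/consul.hcl on each client node
data_dir    = "/opt/consul"
client_addr = "0.0.0.0"
bind_addr   = "0.0.0.0"

# Expose the xDS gRPC port so the local Envoy sidecars can
# fetch their configuration from the client agent.
ports {
  grpc = 8502
}

Then, on each client node:

$ sudo systemctl restart consul
$ sudo systemctl restart frontend-sidecar-proxy   # or the corresponding backend sidecar unit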


Thank you, Ranjandas. I got it working now.

I was using the example from the Consul O'Reilly book. Unfortunately, the example in the book uses a single node, which is not representative of normal setups.

Also, running even a simple Consul service manually in VMs is painful: it requires a lot of typing and launching commands by hand. Luckily, I have set up a project that provides the infrastructure via Terraform (3 nodes: 1 server + 2 workers) and also provisions them with Ansible, including all the necessary bits (HCL config, systemd services, and related entries).

I have two more questions if I may.

I have a typical webapp setup with two services: frontend and backend. The frontend serves the SPA and also fetches data from the backend and sends it to the user's browser. The ingress service is running on my Consul server (as opposed to the client nodes). Is that alright? Should the ingress service run on the client nodes instead of the server?

As I work on this, the next logical step is to incorporate Nomad into the project to make creating services much easier. And that brings me to my next question:

Is it OK to run Nomad and Consul on the same nodes? I know you are supposed to use different nodes, but it seems like a waste of resources for low loads, especially if you use powerful machines. With that in mind, I was thinking about using:

  • 3 server nodes (Nomad + Consul): 2 CPUs, 8 GB RAM, 64 GB disk
  • 3 worker nodes (Nomad + Consul): 2 CPUs, 8 GB RAM, 32 GB disk

What do you think?

-drd

Hi @drio,

I am glad that you got it working. Please find the answers to your questions below.

  1. Running the ingress service on the server instead of the clients.
    From a best-practices point of view, you shouldn't run any other workload on your server agents. As you scale your setup, this will become much more important. You can read more about the server agents' performance requirements and characteristics here: Server Performance | Consul by HashiCorp

    You can also refer to the production readiness checklist and reference architecture to better plan your infrastructure.

  2. Running Nomad and Consul on the same node
    Again, I would recommend sticking to HashiCorp's recommended best practices, but it all boils down to the scale at which you are running. The performance of the server agents (both Consul and Nomad) must not suffer if the cluster is to function smoothly. You can start by running everything on the same VMs if you like, and as you scale you can migrate to separate servers when the need arises.

I am sure you knew the above, but I just wanted to state it explicitly for anyone else referring to this post in the future.

I wish you good luck with your project.