TL;DR: After setting up a web server on one VM and configuring Connect and Envoy to access it from another VM, curl localhost:8889 results in:

    curl: (56) Recv failure: Connection reset by peer
Based on both:
- https://learn.hashicorp.com/consul/developer-mesh/connect-envoy
- https://learn.hashicorp.com/consul/developer-mesh/connect-production
I set up a three-VM Consul cluster, with a web server and a client on two of the VMs, as follows:
VM-1: the web server
- Runs the Apache web server serving a static test page. Listens on 127.0.0.1:80.
- Runs the consul-envoy Docker image as described in https://learn.hashicorp.com/consul/developer-mesh/connect-envoy, but using Envoy 1.13.0, with the following service definition:

      # /consul/config/web.json
      {
        "service": {
          "connect": {
            "sidecar_service": {}
          },
          "name": "web",
          "port": 80
        }
      }

- The consul-envoy container is run using the following Ansible task:

      - name: Run web service proxy
        docker_container:
          name: web-proxy
          image: consul-envoy
          auto_remove: yes
          command: -sidecar-for web
          network_mode: host
          volumes:
            - /consul/certs:/consul/certs:ro
          env:
            CONSUL_HTTP_SSL: "true"
            CONSUL_CACERT: /consul/certs/consul-ca.pem
            CONSUL_CLIENT_CERT: /consul/certs/consul-1-cli.pem
            CONSUL_CLIENT_KEY: /consul/certs/consul-1-cli-key.pem

- From the consul-envoy container, curl localhost works fine and is able to access the web server on the VM.
- The only part of the consul-envoy container logs that seemed relevant is:

      [1][info][upstream] [source/server/lds_api.cc:73] lds: add/update listener 'public_listener:0.0.0.0:21000'
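
Since the log line above shows the sidecar's public listener bound to 0.0.0.0:21000, one basic check is whether that port can be reached from VM-2 at all. The following is a minimal Python sketch and not part of the original setup; `can_connect` is a hypothetical helper name, and the address used in the usage note is an assumption. A successful TCP connect only shows that the port is open; it says nothing about the mutual TLS handshake or about intentions.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a plain TCP connection to host:port succeeds."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable host, and timeouts.
        return False
```

Run from VM-2, a call such as `can_connect("192.168.121.89", 21000)` would indicate whether the listener is reachable, assuming 192.168.121.89 happens to be VM-1 (substitute the real address). A `False` result would point towards a firewall or routing problem rather than a Connect configuration problem.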
VM-2: the web client
- Runs the same consul-envoy Docker image as VM-1, with the following service definition:

      # /consul/config/web-client.json
      {
        "service": {
          "connect": {
            "sidecar_service": {
              "proxy": {
                "upstreams": [
                  {
                    "destination_name": "web",
                    "local_bind_port": 8889
                  }
                ]
              }
            }
          },
          "name": "web-client",
          "port": 8888
        }
      }

- The consul-envoy container is run using the following Ansible task:

      - name: Run web client service proxy
        docker_container:
          name: web-client-proxy
          image: consul-envoy
          auto_remove: yes
          command: -sidecar-for web-client
          network_mode: host
          volumes:
            - /consul/certs:/consul/certs:ro
          env:
            CONSUL_HTTP_SSL: "true"
            CONSUL_CACERT: /consul/certs/consul-ca.pem
            CONSUL_CLIENT_CERT: /consul/certs/consul-2-cli.pem
            CONSUL_CLIENT_KEY: /consul/certs/consul-2-cli-key.pem

- Running curl localhost:8889 results in:

      curl: (56) Recv failure: Connection reset by peer
This is the problem I’m facing!
Why isn’t the client able to get through to the server?
More generally, how can I go about debugging this situation? I expected these tools to make it easier to trace issues, but I’m not sure where to start. The consul-envoy logs don’t seem to contain anything relevant, AFAICT. I was only able to find this line:

    [1][info][upstream] [source/server/lds_api.cc:73] lds: add/update listener 'web:127.0.0.1:8889'
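
One possible starting point for debugging is the Envoy admin interface of the client-side sidecar, whose /clusters page lists the upstream endpoints Envoy knows about along with their health flags. The sketch below is illustrative only: it assumes the admin interface is listening on http://localhost:19000 (the default used by `consul connect envoy`, which this container may or may not keep), and the function names are hypothetical. It handles IPv4 endpoints only.

```python
import urllib.request
from typing import List, Tuple


def parse_health_flags(clusters_text: str) -> List[Tuple[str, str, str]]:
    """Extract (cluster, endpoint, health_flags) from Envoy's /clusters text.

    Per-endpoint lines look like: <cluster>::<ip>:<port>::<stat>::<value>.
    Only the 'health_flags' stat is kept. IPv6 endpoints contain '::'
    themselves and are not handled by this sketch.
    """
    results = []
    for line in clusters_text.splitlines():
        parts = line.strip().split("::")
        if len(parts) == 4 and parts[2] == "health_flags":
            cluster, endpoint, _, flags = parts
            results.append((cluster, endpoint, flags))
    return results


def fetch_clusters(admin_url: str = "http://localhost:19000") -> str:
    """Fetch the plain-text /clusters page (admin address is an assumption)."""
    with urllib.request.urlopen(f"{admin_url}/clusters", timeout=5) as resp:
        return resp.read().decode("utf-8")
```

On VM-2, `parse_health_flags(fetch_clusters())` would list what the proxy sees for the web upstream. If no endpoint appears for web, the proxy has probably not received any instances from Consul; if an endpoint does appear, the address and port it shows are what Envoy is trying to dial, which narrows the search to connectivity, TLS, or authorization between the two sidecars.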
Consul Agent
Consul is run as a Docker container using the following Ansible task:
    - name: Start consul container
      docker_container:
        name: "{{ inventory_hostname }}" # consul-1, consul-2, consul-3
        image: consul
        network_mode: host
        command: agent -server -bind={{ ansible_default_ipv4.address }}
        volumes:
          - /consul/data:/consul/data:rw
          - /consul/certs:/consul/certs:ro
          - /consul/config:/consul/config:rw
Consul Agent Configuration
/consul/config/agent.json:

    {
      "auto_encrypt": {
        "allow_tls": true
      },
      "bootstrap_expect": 3,
      "ca_file": "/consul/certs/consul-ca.pem",
      "cert_file": "/consul/certs/consul-1.pem", # different file for each agent
      "connect": {
        "ca_config": {
          "private_key": "-----BEGIN EC PRIVATE KEY----- ...",
          "root_cert": "-----BEGIN CERTIFICATE----- ..."
        },
        "ca_provider": "consul",
        "enabled": true
      },
      "datacenter": "test",
      "encrypt": "<encryption-key>",
      "key_file": "/consul/certs/consul-1-key.pem", # different file for each agent
      "performance": {
        "raft_multiplier": 1
      },
      "ports": {
        "grpc": 8502,
        "http": -1,
        "https": 8500
      },
      "retry_join": [
        "192.168.121.89",
        "192.168.121.2",
        "192.168.121.202"
      ],
      "server": true,
      "ui": true,
      "verify_incoming": true,
      "verify_incoming_rpc": true,
      "verify_outgoing": true,
      "verify_server_hostname": true
    }
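
Given that the agents only expose HTTPS on port 8500 and verify incoming connections, another check is to ask Consul which address and port it advertises for the web sidecar, via the `/v1/health/connect/<service>` endpoint. This is a rough sketch under assumptions: the function names are hypothetical, the base URL and certificate paths are taken from the configuration above, and TLS hostname verification may fail if the agent certificate has no SAN matching the host used in the URL.

```python
import json
import ssl
import urllib.request
from typing import List, Tuple


def proxy_endpoints(entries: list) -> List[Tuple[str, int, List[str]]]:
    """Return (address, port, check statuses) for each entry of a
    /v1/health/connect/<service> response."""
    out = []
    for entry in entries:
        svc = entry["Service"]
        # An empty Service.Address means the node's address is used instead.
        address = svc.get("Address") or entry["Node"]["Address"]
        statuses = [check["Status"] for check in entry.get("Checks", [])]
        out.append((address, svc["Port"], statuses))
    return out


def fetch_connect_health(service: str, base_url: str,
                         cacert: str, cert: str, key: str) -> list:
    """Query the local agent over HTTPS, presenting a client certificate."""
    ctx = ssl.create_default_context(cafile=cacert)
    ctx.load_cert_chain(certfile=cert, keyfile=key)
    url = f"{base_url}/v1/health/connect/{service}"
    with urllib.request.urlopen(url, context=ctx, timeout=5) as resp:
        return json.load(resp)
```

On VM-2 this could be called as `proxy_endpoints(fetch_connect_health("web", "https://localhost:8500", "/consul/certs/consul-ca.pem", "/consul/certs/consul-2-cli.pem", "/consul/certs/consul-2-cli-key.pem"))`. If the reported address is not reachable from VM-2, or any check status is not passing, that would be worth investigating before looking at Envoy itself.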
I hope I did not miss any relevant information. Please feel free to ask for any.
PS. I think adding a guide for a setup like this one to the Learn articles would be valuable. The existing guides are far from production-ready and are all based on containers on a single host, which is almost never what a real deployment looks like.