Problems with Consul Connect + Mesh Gateways

ShadowSteps · May 6, 2020, 5:48pm

Hello,

I am trying to deploy Connect gateways in our testing environment [https://learn.hashicorp.com/consul/developer-mesh/connect-gateways], with plans to move to production, but I ran into a problem and cannot find a solution for it online.
Setup:

Two WAN-connected datacenters [dc1, dc2]
ACL enabled + replication
TLS enabled
Connect enabled
Envoy mesh gateways deployed in dc1 and dc2.
Gateways can connect in both directions, and each shows the other as healthy in the Envoy checks.

For testing, we started local services (socat) and registered them with the mesh gateway mode set to local. When we try to connect, no connection can be established. We inspected the debug logs of the four proxies. On the local proxy for the socat client (socat-web in dc1), the upstream shows health_flags::/failed_eds_health. All health checks in both Consul datacenters are passing, and every other endpoint in the Envoy instances (gateway1, gateway2, etc.) is healthy; only the upstream is failing. We have been unable to solve this, and we believe it is the cause of the problem, because each connection attempt logs “no healthy host for TCP connection pool” in the Envoy log for the upstream proxy.
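
One way to spot this state quickly is to scan the text that the Envoy admin `/clusters` endpoint returns for hosts whose `health_flags` value is anything other than `healthy`. The following is a minimal sketch, not part of the original setup: the function name `unhealthy_hosts`, the sample cluster name, and the sample addresses are all made up for illustration, and it assumes the usual `<cluster>::<address>::health_flags::<flags>` line layout.

```python
def unhealthy_hosts(clusters_text):
    """Return (cluster, address, flags) for every host not reported as healthy.

    Assumes Envoy's plain-text /clusters layout, where host health appears as
    <cluster>::<address>::health_flags::<flags>.
    """
    marker = "::health_flags::"
    results = []
    for line in clusters_text.splitlines():
        line = line.strip()
        if marker not in line:
            continue  # stats such as cx_active are not health lines
        left, flags = line.split(marker, 1)
        # Split on the first "::" only, so IPv6 addresses stay intact.
        cluster, _, address = left.partition("::")
        if flags != "healthy":
            results.append((cluster, address, flags))
    return results


# Invented sample data; real cluster names and addresses will differ.
SAMPLE = """\
local_app::127.0.0.1:8080::health_flags::healthy
socat.default.dc2.internal.example.consul::10.0.0.5:8443::health_flags::/failed_eds_health
socat.default.dc2.internal.example.consul::10.0.0.5:8443::cx_active::0
"""

if __name__ == "__main__":
    for cluster, address, flags in unhealthy_hosts(SAMPLE):
        print(f"{cluster} {address} {flags}")
```

In practice the input would come from the sidecar's admin listener, for example the output of `curl localhost:19000/clusters` if the admin bind address was left at the `consul connect envoy` default.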

I am uploading most of the configuration, with unimportant values removed.
The dc2 and secondary gateway configs are missing, but they are nearly identical to the dc1 ones apart from different IPs, etc.

consul-client.txt (775 Bytes)
consul-server.txt (1016 Bytes)
dc1-gateway.txt (273 Bytes)
socat.txt (297 Bytes)
socat-web.txt (377 Bytes)

Best regards, Kiril

blake · May 23, 2020, 8:17pm

Hi @ShadowSteps,

Can you share the Envoy proxy logs for the gateways and local proxies? Specifically, it would be helpful to see the messages that are output when you try to initiate a connection to the upstream service. That info may help with debugging.

Thanks.

ballinette · September 14, 2020, 11:54am

Hi
I have the same issue as @ShadowSteps, with nearly the same settings.

Here are the logs, with both mesh-gateway and sidecar services run in debug mode:

mesh-gateway-primary.log.txt (51.6 KB)
mesh-gateway-secondary.log.txt (52.8 KB)
sidecar-socat-primary.log.txt (30.7 KB)
sidecar-web-secondary.log.txt (39.7 KB)

Apart from the “no healthy host for TCP connection pool” line, I do not have enough knowledge to interpret these logs…

If anybody can help, thanks in advance…

Additional info:

# consul --version
Consul v1.8.3
Revision a9322b9c7
Protocol 2 spoken by default, understands 2 to 3 (agent will automatically use protocol >2 when speaking to compatible agents)

# envoy --version

envoy  version: 1a0363c885c2dbb1e48b03847dbd706d1ba43eba/1.14.2/clean-getenvoy-fbeeb15-envoy/RELEASE/BoringSSL

ShadowSteps · September 14, 2020, 11:11pm

Hey,

I was never able to resolve the problem in that setup, but later I got it working without any issues when I installed the mesh gateway on the same machine as the Consul server. Connect ran fine that way, but whenever the mesh gateways and servers were on different machines I hit this problem.

Best regards, Kiril

ballinette · September 15, 2020, 12:39pm

Hi,
Thanks for your feedback.

For my part, I have the issue even when the mesh gateway and consul server are on the same machine.

stallion01 · December 9, 2021, 3:23am

I ran into the same issue and, after spending an unspecified amount of time troubleshooting, it turned out to be the ServiceResolver. You have to create a ServiceResolver for the upstream service (along with proxy-defaults), and Envoy will then populate the IP address of the mesh gateway in the /clusters API output.
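
The post does not say what the two config entries contained, so the following is only a guess at their shape: a `proxy-defaults` entry that sets the mesh gateway mode, and a `service-resolver` that redirects the upstream to the remote datacenter. The service name `socat`, the target `dc2`, and the helper name `mesh_gateway_entries` are assumptions taken from the setup described earlier in the thread, not confirmed values. Since `consul config write` accepts JSON as well as HCL, the sketch builds the entries as Python dicts and prints them as JSON.

```python
import json


def mesh_gateway_entries(upstream, target_dc, mode="local"):
    """Build two Consul config entries as plain dicts (hypothetical helper).

    upstream and target_dc are placeholders for the real service name and
    the datacenter that hosts it.
    """
    proxy_defaults = {
        "Kind": "proxy-defaults",
        "Name": "global",  # proxy-defaults entries must be named "global"
        "MeshGateway": {"Mode": mode},
    }
    resolver = {
        "Kind": "service-resolver",
        "Name": upstream,
        # Assumption: the upstream only runs in the other datacenter.
        "Redirect": {"Datacenter": target_dc},
    }
    return [proxy_defaults, resolver]


if __name__ == "__main__":
    for entry in mesh_gateway_entries("socat", "dc2"):
        print(json.dumps(entry, indent=2))
```

Each printed object could be saved to its own file and applied with `consul config write <file>` in the datacenter where the client sidecar runs.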
