Problems with Consul Connect + Mesh Gateways


I am trying to deploy Connect mesh gateways in our testing environment, with plans to move to production, but I have run into a problem and cannot find a solution for it online.

  • Two WAN-connected datacenters [dc1, dc2]
  • ACL enabled + replication
  • TLS enabled
  • Connect enabled
  • Envoy mesh gateways deployed in dc1 and dc2.
  • The gateways can connect to each other in both directions, and each reports the other as healthy in the Envoy checks.

We started local services for testing purposes (socat) and registered them with the setting mesh-gateway: local. When we try to connect, no connection can be established. We inspected the debug logs of four proxies, and on the local proxy for the socat client (socat-web in dc1) the upstream proxy shows health_flags::/failed_eds_health. All health checks in both Consul datacenters are passing, and every other health check in the Envoys (gateway1, gateway2, etc.) is healthy; only the upstream is failing. We have not been able to resolve this, and we believe it is the cause of the problem, because every connection attempt produces “no healthy host for TCP connection pool” in the Envoy logs for the upstream proxy.
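For anyone trying to reproduce the check, here is a minimal sketch of how the unhealthy upstream can be picked out of the text returned by the Envoy admin /clusters endpoint. It assumes the output has already been saved as a string and that per-host lines have the form cluster::ip:port::stat::value with IPv4 addresses. The function name, the sample cluster name, and the sample addresses are illustrative, not taken from our environment.

```python
def unhealthy_hosts(clusters_text):
    """Return (cluster, host, flags) for hosts whose health_flags is not 'healthy'.

    Assumes per-host lines look like: <cluster>::<ip:port>::<stat>::<value>
    and that addresses are IPv4 (an IPv6 literal would contain '::' itself).
    """
    results = []
    for line in clusters_text.splitlines():
        parts = line.strip().split("::")
        # Only per-host health_flags lines are of interest; skip everything else.
        if len(parts) == 4 and parts[2] == "health_flags" and parts[3] != "healthy":
            results.append((parts[0], parts[1], parts[3]))
    return results


# Illustrative sample only; real cluster names and addresses will differ.
SAMPLE = """\
local_app::127.0.0.1:8181::health_flags::healthy
socat.default.dc2.internal.example.consul::10.0.0.5:8443::cx_active::0
socat.default.dc2.internal.example.consul::10.0.0.5:8443::health_flags::/failed_eds_health
"""

for cluster, host, flags in unhealthy_hosts(SAMPLE):
    print(f"{cluster} {host} {flags}")
```

Running the same function against the /clusters output of each of the four proxies makes it quick to confirm that only the upstream in the socat-web sidecar is flagged.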

I am uploading most of the configuration, with non-essential values removed.
The configuration for dc2 and the secondary gateway is not included, but it is nearly identical to dc1 apart from different IPs, etc.

consul-client.txt (775 Bytes)
consul-server.txt (1016 Bytes)
dc1-gateway.txt (273 Bytes)
socat.txt (297 Bytes)
socat-web.txt (377 Bytes)

Best regards, Kiril

Hi @ShadowSteps,

Can you share the Envoy proxy logs for the gateways and the local proxies? Specifically, it would be helpful to see the messages that are output when you try to initiate a connection to the upstream service. That information may help with debugging.