“No healthy host for TCP connection pool” - Envoy and Consul

Hello there,

I am facing a weird issue with Envoy: it is unable to communicate with an upstream service. I have a service deployed on a VM, and it is trying to reach an external service via a terminating gateway.

I keep seeing the logs below.

[2021-08-03 17:37:58.470][28051][debug][filter] [external/envoy/source/common/tcp_proxy/tcp_proxy.cc:242] [C357] new tcp proxy session
[2021-08-03 17:37:58.470][28051][debug][filter] [external/envoy/source/common/tcp_proxy/tcp_proxy.cc:389] [C357] Creating connection to cluster local_app
[2021-08-03 17:37:58.470][28051][debug][pool] [external/envoy/source/common/tcp/original_conn_pool.cc:98] creating a new connection
[2021-08-03 17:37:58.470][28051][debug][pool] [external/envoy/source/common/tcp/original_conn_pool.cc:383] [C358] connecting
[2021-08-03 17:37:58.470][28051][debug][connection] [external/envoy/source/common/network/connection_impl.cc:769] [C358] connecting to 127.0.0.1:9393
[2021-08-03 17:37:58.470][28051][debug][connection] [external/envoy/source/common/network/connection_impl.cc:785] [C358] connection in progress
[2021-08-03 17:37:58.470][28051][debug][pool] [external/envoy/source/common/tcp/original_conn_pool.cc:125] queueing request due to no available connections
[2021-08-03 17:37:58.470][28051][debug][conn_handler] [external/envoy/source/server/connection_handler_impl.cc:476] [C357] new connection
[2021-08-03 17:37:58.470][28049][debug][filter] [external/envoy/source/common/tcp_proxy/tcp_proxy.cc:242] [C359] new tcp proxy session
[2021-08-03 17:37:58.470][28049][debug][filter] [external/envoy/source/common/tcp_proxy/tcp_proxy.cc:389] [C359] Creating connection to cluster static-server-external.default.<domain_name>.internal.68e1f25d-ffe8-a9cc-f88f-fe66cab6d592.consul
[2021-08-03 17:37:58.470][28051][debug][connection] [external/envoy/source/extensions/transport_sockets/tls/ssl_socket.cc:215] [C357] 
[2021-08-03 17:37:58.470][28049][debug][upstream] [external/envoy/source/common/upstream/cluster_manager_impl.cc:1417] no healthy host for TCP connection pool
[2021-08-03 17:37:58.470][28051][debug][connection] [external/envoy/source/common/network/connection_impl.cc:203] [C357] closing socket: 0
[2021-08-03 17:37:58.470][28049][debug][connection] [external/envoy/source/common/network/connection_impl.cc:107] [C359] closing data_to_write=0 type=1
[2021-08-03 17:37:58.470][28049][debug][connection] [external/envoy/source/common/network/connection_impl.cc:203] [C359] closing socket: 1
[2021-08-03 17:37:58.470][28051][debug][pool] [external/envoy/source/common/tcp/original_conn_pool.cc:223] canceling pending request
[2021-08-03 17:37:58.470][28051][debug][pool] [external/envoy/source/common/tcp/original_conn_pool.cc:231] canceling pending connection
[2021-08-03 17:37:58.470][28051][debug][connection] [external/envoy/source/common/network/connection_impl.cc:107] [C358] closing data_to_write=0 type=1
[2021-08-03 17:37:58.470][28051][debug][connection] [external/envoy/source/common/network/connection_impl.cc:203] [C358] closing socket: 1
[2021-08-03 17:37:58.470][28051][debug][pool] [external/envoy/source/common/tcp/original_conn_pool.cc:140] [C358] client disconnected
[2021-08-03 17:37:58.470][28051][debug][conn_handler] [external/envoy/source/server/connection_handler_impl.cc:152] [C357] adding to cleanup list
[2021-08-03 17:37:58.470][28051][debug][pool] [external/envoy/source/common/tcp/original_conn_pool.cc:255] [C358] connection destroyed
[2021-08-03 17:37:58.734][28022][debug][main] [external/envoy/source/server/server.cc:190] flushing stats
[2021-08-03 17:38:03.734][28022][debug][main] [external/envoy/source/server/server.cc:190] flushing stats

A similar setup works perfectly fine in another environment of mine. Initially I made the mistake of registering the wrong node name and IP for my external service, and then corrected it. Even after this correction, the external service kept getting deregistered. Only after I restarted the Consul servers did the automatic deregistration of the external service stop.

But even then, Envoy on the VM is not able to communicate with the external service via the terminating gateway, and it reports the error posted above.

Is this some sort of caching issue? Is Consul still confused by the old entry?

Consul → 1.9.5
Envoy → 1.16.2

Hi @ashwinkupatkar,

Are you monitoring the health state of the external service using something like Consul-ESM? If so, is that service currently reporting as healthy?
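
If it helps, here is a rough Python sketch for checking what Consul itself reports for the service. It assumes the agent's HTTP API is reachable at the default http://localhost:8500 and that no ACL token is needed (both are assumptions about your setup). It uses the /v1/health/service/<name> endpoint; the helper names fetch_health and failing_instances are made up for this example.

```python
import json
import urllib.request


def failing_instances(health_entries):
    """Return (node, service_id, check_name, status) for every non-passing check.

    health_entries is the parsed JSON list returned by
    GET /v1/health/service/<name>.
    """
    failing = []
    for entry in health_entries:
        node = entry["Node"]["Node"]
        service_id = entry["Service"]["ID"]
        for check in entry.get("Checks", []):
            if check["Status"] != "passing":
                failing.append((node, service_id, check["Name"], check["Status"]))
    return failing


def fetch_health(service, base_url="http://localhost:8500"):
    """Fetch health entries for a service (base_url is an assumed default)."""
    with urllib.request.urlopen(f"{base_url}/v1/health/service/{service}") as resp:
        return json.load(resp)
```

Calling failing_instances(fetch_health("static-server-external")) should return an empty list if every check is passing. The service name here is guessed from the cluster name in your logs, so substitute whatever name you registered.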

Aside from that, in order to better troubleshoot this, can you provide the following info from the Envoy proxies?

  • Output of curl localhost:19000/clusters from the source proxy.
  • Relevant logs from the terminating gateway.
  • Output of curl localhost:19000/clusters from the terminating gateway.
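
The /clusters output can be long, so a small script may help narrow it down. The sketch below is a minimal example, assuming the plain-text format of Envoy's admin /clusters endpoint, where each host has a line like <cluster>::<address>::health_flags::<flags>. It lists clusters that have no host flagged healthy, including clusters with no hosts at all. The function names are hypothetical, and the admin address http://localhost:19000 is taken from the curl commands above.

```python
import urllib.request


def hosts_by_cluster(clusters_text):
    """Map each cluster name to a list of (address, health_flags) tuples."""
    hosts = {}
    for line in clusters_text.splitlines():
        line = line.strip()
        if "::" not in line:
            continue
        cluster = line.split("::", 1)[0]
        # Register the cluster even if it turns out to have no hosts.
        hosts.setdefault(cluster, [])
        left, sep, flags = line.partition("::health_flags::")
        if sep:
            address = left.split("::", 1)[1]
            hosts[cluster].append((address, flags))
    return hosts


def clusters_without_healthy_host(clusters_text):
    """Return sorted names of clusters with zero hosts flagged 'healthy'."""
    return sorted(
        name
        for name, cluster_hosts in hosts_by_cluster(clusters_text).items()
        if not any(flags == "healthy" for _, flags in cluster_hosts)
    )


def fetch_clusters(admin_url="http://localhost:19000"):
    """Fetch the raw /clusters text from the Envoy admin interface."""
    with urllib.request.urlopen(f"{admin_url}/clusters") as resp:
        return resp.read().decode("utf-8")
```

Running clusters_without_healthy_host(fetch_clusters()) on the source proxy and on the terminating gateway should show whether the external-service cluster has no endpoints at all or has endpoints that are marked unhealthy.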

Thanks.


Hi @blake, thanks for responding. I will get back to you with those details.