Failed Download Connections (Connect with Envoy)

When using Consul Connect with Envoy, some client operations fail frequently, for example:

  • APT updates (apt update) -> Connection failed
  • Docker run -> 500 Server Error: Internal Server Error ("Get ...: EOF")

These clients use a Consul Connect Envoy client proxy to connect to an Envoy server proxy that sits in front of local caching APT and Docker proxies. The APT and Docker proxies may take a while to respond if they first have to fetch content from the Internet.

I suspect a short timeout somewhere, but none of the timeout settings I could find (https://www.consul.io/docs/connect/proxies/envoy.html) helped. I still get the errors shown above with both connect_timeout_ms and local_connect_timeout_ms set to 300000.

For server services, I’m using a service definition like this:

{
    "service": {
        "name": "some-server",
        "port": 1234,
        "connect": {
            "sidecar_service": {
                "port": 21234,
                "proxy": {
                    "config": {
                        "local_connect_timeout_ms": 300000
                    }
                }
            }
        }
    }
}

For client services, I’m using a service definition like this:

{
    "service": {
        "name": "some-client",
        "connect": {
            "sidecar_service": {
                "port": 41234,
                "proxy": {
                    "config": {
                        "local_connect_timeout_ms": 300000
                    },
                    "upstreams": [
                        {
                            "destination_name": "some-server",
                            "local_bind_port": 58081,
                            "config": {
                                "connect_timeout_ms": 300000
                            }
                        }
                    ]
                }
            }
        }
    }
}
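
To make the placement of the two settings explicit, here is a minimal Python sketch that walks the client definition above and prints the path of every key ending in _timeout_ms. The helper name find_timeouts is made up for illustration; it only inspects the JSON and does not validate anything against Consul.

```python
import json


def find_timeouts(node, path=""):
    """Yield (path, value) for every key ending in '_timeout_ms'."""
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}" if path else key
            if key.endswith("_timeout_ms"):
                yield child, value
            else:
                yield from find_timeouts(value, child)
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from find_timeouts(item, f"{path}[{index}]")


# The client service definition from this post, unchanged.
CLIENT_DEFINITION = json.loads("""
{
    "service": {
        "name": "some-client",
        "connect": {
            "sidecar_service": {
                "port": 41234,
                "proxy": {
                    "config": {"local_connect_timeout_ms": 300000},
                    "upstreams": [
                        {
                            "destination_name": "some-server",
                            "local_bind_port": 58081,
                            "config": {"connect_timeout_ms": 300000}
                        }
                    ]
                }
            }
        }
    }
}
""")

for where, value in find_timeouts(CLIENT_DEFINITION):
    print(f"{where} = {value} ms")
```

This shows local_connect_timeout_ms under proxy.config and connect_timeout_ms under proxy.upstreams[0].config, which is how the definitions above are written.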

  1. Is this the correct way of using these two timeout settings?
  2. Are there any other timeout settings that may be causing this problem?
  3. What else, other than timeouts, could be causing these issues, given that the same services work perfectly when connected to directly without Connect/Envoy?
  4. How can I debug this issue to find out how Connect/Envoy is failing? I don’t see anything relevant in Envoy logs, and could not configure Envoy access logs using Connect.
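
One way to narrow this down without Envoy access logs might be to time a raw TCP connection through the upstream's local bind port and record how it ends. The sketch below is only an illustration: the function name probe and its outcome labels are invented here, and it assumes the client-side proxy is listening on 127.0.0.1:58081 as in the definition above.

```python
import socket
import time


def probe(host, port, payload=b"", timeout=600.0):
    """Connect over TCP, optionally send payload, and wait for the first bytes.

    Returns (outcome, seconds). outcome is "data" if bytes arrived, "eof" if
    the peer closed the connection without sending anything, "timeout" if
    nothing happened within `timeout` seconds, or "error: ..." for any other
    socket error (for example connection refused or reset).
    """
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            if payload:
                sock.sendall(payload)
            chunk = sock.recv(4096)
            outcome = "data" if chunk else "eof"
    except socket.timeout:
        outcome = "timeout"
    except OSError as exc:
        outcome = f"error: {exc}"
    return outcome, time.monotonic() - start
```

For example, probe("127.0.0.1", 58081, request_bytes), where request_bytes is whatever the real client would send first, returns how the connection ended and after how many seconds. If "eof" shows up at a consistent elapsed time, that interval may point to the timeout that is actually in effect.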

Hi @akhayyat,

I suspect you may be running into the issue described in hashicorp/consul#6382. Does this sound like what you’re experiencing in your environment?

The linked issue appears to be about HTTP proxies, doesn’t it?
Mine is with TCP proxies, and I don’t think there is a routes configuration for TCP proxies.

The timeouts configured in my original post seem to have resolved APT’s Connection failed errors, but not Docker’s.

Docker continues to fail with 500 errors even after increasing the timeouts to 600,000 ms, without any trace in the server logs.