When using Consul Connect with Envoy, some services fail often, such as:
- APT updates (apt update) -> Connection failed
- Docker run -> 500 Server Error: Internal Server Error ("Get ...: EOF
These clients use a Consul Connect Envoy client proxy to connect to an Envoy server proxy in front of local caching APT and Docker proxies. The APT and Docker proxies may take a while to respond if they have to go fetch some content from the Internet first.
I suspect a short timeout somewhere, but none of the timeout settings I could find (https://www.consul.io/docs/connect/proxies/envoy.html) helped. I still get the errors shown above with both connect_timeout_ms and local_connect_timeout_ms set to 300000 (5 minutes).
For server services, I’m using a service definition like this:
{
  "service": {
    "name": "some-server",
    "port": 1234,
    "connect": {
      "sidecar_service": {
        "port": 21234,
        "proxy": {
          "config": {
            "local_connect_timeout_ms": 300000
          }
        }
      }
    }
  }
}
For client services, I’m using a service definition like this:
{
  "service": {
    "name": "some-client",
    "connect": {
      "sidecar_service": {
        "port": 41234,
        "proxy": {
          "config": {
            "local_connect_timeout_ms": 300000
          },
          "upstreams": [
            {
              "destination_name": "some-server",
              "local_bind_port": 58081,
              "config": {
                "connect_timeout_ms": 300000
              }
            }
          ]
        }
      }
    }
  }
}
- Is this the correct way of using these two timeout settings?
- Are there any other timeout settings that may be causing this problem?
- What else, other than timeouts, could be causing these issues, given that the same services work perfectly when connected to directly without Connect/Envoy?
- How can I debug this issue to find out how Connect/Envoy is failing? I don’t see anything relevant in Envoy logs, and could not configure Envoy access logs using Connect.
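To narrow down whether a timeout is involved at all, one option is to replace the APT/Docker proxy with a stand-in that answers after a known delay, then time requests made through the client-side upstream port. Below is a minimal sketch using only the Python standard library. The helper names make_slow_server and timed_get are made up for illustration, and the sketch assumes the proxied service speaks plain HTTP.

```python
import threading
import time
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


def make_slow_server(delay_s, port=0):
    """Build an HTTP server that waits delay_s seconds before answering.

    Hypothetical stand-in for a caching proxy that has to fetch
    content from the Internet before it can respond.
    """

    class SlowHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            time.sleep(delay_s)  # simulate a slow cache miss
            body = b"ok\n"
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, fmt, *args):
            pass  # suppress per-request logging

    # port=0 lets the OS pick a free port; pass 1234 to match "some-server"
    return HTTPServer(("127.0.0.1", port), SlowHandler)


def timed_get(url, timeout_s=600):
    """Fetch url and return (outcome, elapsed_seconds)."""
    # Ignore http_proxy/https_proxy environment variables so the
    # request goes straight to the given address.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    start = time.monotonic()
    try:
        with opener.open(url, timeout=timeout_s) as resp:
            outcome = f"HTTP {resp.status}"
    except Exception as exc:  # connection reset, EOF, HTTP error, ...
        outcome = f"failed: {exc!r}"
    return outcome, time.monotonic() - start
```

With the service definitions above, the stand-in would listen on port 1234 (run srv.serve_forever() in a thread, as with threading.Thread(target=srv.serve_forever, daemon=True)), and the client would call timed_get("http://127.0.0.1:58081/"). Increasing delay_s step by step until the request starts failing, and noting the elapsed time at failure, should indicate whether a fixed timeout is cutting the connection and roughly how long it is.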