Envoy/Consul Connect - upstream request timeout

Hello.

I’m having some issues with SOAP requests when using Nomad & Consul Connect. We have some large SOAP requests where the response might take a long time to generate. This causes the request to fail with “upstream request timeout” after 15s (accessing via traefik → consul-ingress → SOAP-service).

I believe this timeout is from Envoy default settings:

Specifies the upstream timeout for the route. If not specified, the default is 15s. This spans between the point at which the entire downstream request (i.e. end-of-stream) has been processed and when the upstream response has been completely processed. A value of 0 will disable the route’s timeout.

Is there any way to change this setting for the Envoy sidecar(s)/proxy via Nomad? It appears it should be possible via the sidecar_task settings, but I am not sure how to do this properly. Would I need to pass in a completely new config file to replace the default “${NOMAD_SECRETS_DIR}/envoy_bootstrap.json”?

Great question, @kds-rune. Unfortunately, Consul does not yet expose this config option directly:

And it also doesn’t expose the timeout as an escape-hatch option in upstream.config:

(which Nomad currently doesn’t plumb through yet anyway)

It seems the only option for now would be to supply your own complete bootstrap config file, using sidecar_task as you were thinking. You can see the default task config block Nomad uses and use that as a starting point. Unfortunately we also don’t really have documentation around doing this yet.


Hello, and thank you for the response. I completely missed the existing issue; thank you for linking.

It seems to have been an active topic for quite some time (2019 to 2021-05), so hopefully it’s something that will get implemented eventually.

In the meantime, I will do my best, using the information in the existing issues. As far as I could understand from the discussions, there already seem to be some working configurations using a service-router config (Consul) (9554).

Edit: Seems I use a different account on mobile… :joy:

Just a quick update, in case anyone has this issue.

Upgraded to Consul 1.10.0-beta, and it is now possible to override the 15s default timeout via Consul config entries (ref. the linked Consul issue in the previous comment):

Backend service-router

{
    "Kind": "service-router",
    "Name": "my-service",
    "Routes": [
        {
            "Match": {
                "HTTP": {}
            },
            "Destination": {
                "RequestTimeout": "10m0s"
            }
        }
    ]
}
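A note for anyone applying this: as far as I understand, a service-router only takes effect when the target service uses an HTTP-based protocol in Consul. A minimal sketch of a matching service-defaults entry is below, assuming the same service name ("my-service") as in the router above and that the protocol is not already set elsewhere (for example via proxy-defaults):

```json
{
    "Kind": "service-defaults",
    "Name": "my-service",
    "Protocol": "http"
}
```

Both entries can be loaded with `consul config write <file>`, once per file. The file names are up to you.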

I was just about to suggest checking out the Consul 1.10 beta. We have been using it for this exact reason on a POC while tracking the new release. So far, from a timeout perspective, it has worked well. We have two-minute long polls on a service (long story…) going from client to ingress gateway to the service, and the beta has had no issues.

I suspect the GA will drop soon. I see it’s up to RC2, so fingers crossed this will be ready for prod soon.