We’ve been preparing to start testing Consul 1.10, and it’s interesting to me to see that with the new UpstreamConfig configuration for services, that ConnectTimeoutMs
has the following note:
NOTE: The connect timeout of a service should ideally be configured via the
connect_timeout
field of aservice-resolver
config entry for the upstream destination service. Configuring it in a proxy upstream config will not fully enable some L7 features. It is supported here for backwards compatibility with Consul versions prior to 1.6.0.
This looks like copy & paste from the original proxy.upstreams[].config
field in a proxy service definition: Connect - Envoy Integration | Consul by HashiCorp. I understand the limitations of there, but given service-defaults
works through the discovery chain & everything else modern I was curious if this recommendation still applies here?
I’ve had a bit of a dig through the source and haven’t been able to see which L7 functionality would be lost by configuring the connect timeout on a service-defaults
instead of a service-resolver
. Is there something or should this note be removed?
The reason I ask is, it seems like service-defaults
is a pretty reasonable spot to configure for upstreams these timeouts or at least the default for them. It seems correct that because you’re configuring the originating Envoy’s timeouts that you would want to do it on the originating service’s service-defaults
. If you have a service-splitter
or service-resolver
in place you might have different concrete services which require different timeouts (hence why configurability exists at those config entry layers), but UpstreamConfig
might be a good spot to configure defaults.
As a concrete example:
Kind = "service-defaults"
Name = "counting"
UpstreamConfig = {
Defaults = {
ConnectTimeoutMs = 1000
}
Overrides = [
{
Name = "dashboard"
ConnectTimeoutMs = 2000
}
]
}
vs
Kind = "service-resolver"
Name = "dashboard"
ConnectTimeout = "2s"
Given we’re controlling the timeout for counting -> counting Envoy (THIS ONE) -> dashboard Envoy -> dashboard
, it does make sense to me that the configuration exists for counting
.
The other reason I ask is, I’d love to implement RequestTimeout
as an UpstreamConfig
option, instead of having to create a service-router
for every service to configure it (in our situation, we prefer every originating/egress Envoy to have no outbound request timeout – we enforce these on the ingress side). Before I do so & contribute upstream, I’d like to make sure this change is aligned.