Hi.
We want to use Consul Connect to connect our applications to shared service instances (Postgres, RabbitMQ) within the same Nomad datacenter/region.
We got it working with
job ... {
  ...
  group ... {
    ...
    service {
      name = "rabbitmq-dev"
      port = "5672"
      connect {
        sidecar_service {}
      }
    }
  }
}
on the service side and
job ... {
  ...
  group ... {
    service {
      name = "${TASKGROUP}-rabbitmq-dev-5672-5672"
      port = "5672"
      connect {
        sidecar_service {
          proxy {
            upstreams {
              destination_name = "rabbitmq-dev"
              local_bind_port = 5672
            }
          }
        }
      }
    }
  }
}
on the application side.
However, Envoy terminates idle connections after 1 hour. While this is OK for some deployments, we definitely have situations (for example, overnight) where connections can sit idle for a long time. The first time the application then tries to use the connection, an exception is raised because Envoy has terminated it in the meantime, without reconnecting or, even better, propagating the connection drop to the connected client.
The documentation isn’t clear about which connection (application → sidecar, sidecar → sidecar, or sidecar → service) the idle_timeout applies to.
We’ve tried to set
config {
  envoy_gateway_remote_tcp_enable_keepalive = true
  envoy_gateway_remote_tcp_keepalive_time = 30
  envoy_gateway_remote_tcp_keepalive_interval = 30
}
on both the service side and the application side (under connect → sidecar_service → proxy), without success.
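To make the placement concrete, this is roughly what the application-side attempt looks like. It is only a sketch assembled from the snippets in this post: the surrounding job and group are elided as before, and the exact position of the config block next to upstreams is our reading of the Nomad proxy block, not something we have confirmed to be the intended usage for these options.

```hcl
# Sketch: application-side service with the keepalive settings we tried.
# The enclosing job/group blocks are omitted here, as in the snippets above.
service {
  name = "${TASKGROUP}-rabbitmq-dev-5672-5672"
  port = "5672"

  connect {
    sidecar_service {
      proxy {
        # Upstream to the shared RabbitMQ service, bound locally on 5672.
        upstreams {
          destination_name = "rabbitmq-dev"
          local_bind_port  = 5672
        }

        # Opaque proxy configuration handed to the sidecar.
        # Assumption: placing the keys here is what
        # "connect → sidecar_service → proxy" refers to.
        config {
          envoy_gateway_remote_tcp_enable_keepalive   = true
          envoy_gateway_remote_tcp_keepalive_time     = 30
          envoy_gateway_remote_tcp_keepalive_interval = 30
        }
      }
    }
  }
}
```

On the service side, the same config block went into an otherwise empty sidecar_service, inside a proxy block.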
We’re running Nomad 1.4.13 and Consul 1.4.10.
Any help with preventing Envoy from dropping the connection, or even with diagnosing the problem, would be greatly appreciated.