Hello Nomad team and community.
I am having trouble configuring Consul Connect with the Envoy proxy on AWS, and I would appreciate some guidance on how to proceed or troubleshoot it. In short, connect-proxy is writing warnings like this to stderr:
[2021-04-23 02:03:39.373][1][warning][config] [bazel-out/k8-opt/bin/source/common/config/_virtual_includes/grpc_stream_lib/common/config/grpc_stream.h:87] gRPC config stream closed: 14, upstream connect error or disconnect/reset before headers. reset reason: connection termination
I am running a cluster on consul-1.9.5, nomad-1.0.4, and envoy-1.16.2.
Here is a test job that uses Consul Connect (I dropped the health check for the time being, otherwise the deployment gets stuck):
job "http-connect" {
  datacenters = ["us-east-1c"]
  group "echo" {
    network {
      mode = "bridge"
    }
    service {
      name = "http-connect"
      port = "8080"
      connect {
        sidecar_service {}
      }
    }
    task "server" {
      driver = "docker"
      config {
        image = "hashicorp/http-echo:latest"
        args = [
          "-listen", ":8080",
          "-text", "Hello and welcome to http-echo running on port 8080",
        ]
      }
    }
  }
}
This job deploys to Nomad successfully and is registered in Consul via “nomad run”. I can also exec into the connect-proxy task of the allocation and get inside the container.
Security groups in AWS are configured to accept traffic on ports 8300, 8301, 8302, 8400, 8500, 8502, 8600, and 21000-21255 (pretty much what is listed in "Required Ports | Consul by HashiCorp", plus an extra port 8400). All outbound ports are open.
Nomad agents have ports 4646, 4647, and 4648 open (both TCP and UDP), as well as the 20000-32000 dynamic port range.
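To rule out the security groups, the TCP side of these ports can be probed from a Nomad client node. The following is a minimal sketch rather than part of my actual setup: `check_port` and `check_ports` are hypothetical helpers, they assume bash (for `/dev/tcp`) and the coreutils `timeout` command are available, and they only test TCP, so the UDP side of 8301, 8302, 8600, and 4648 is not covered.

```shell
#!/usr/bin/env bash

# check_port (hypothetical helper): prints "open" if a TCP connection to
# HOST:PORT succeeds within 2 seconds, otherwise "closed".
check_port() {
  local host="$1" port="$2"
  # bash opens a TCP connection when redirecting to /dev/tcp/HOST/PORT;
  # timeout bounds the attempt so silently filtered ports do not hang.
  if timeout 2 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null; then
    echo "open"
  else
    echo "closed"
  fi
}

# check_ports (hypothetical helper): probes every listed port on one host
# and prints one "PORT: open|closed" line per port.
check_ports() {
  local host="$1" port
  shift
  for port in "$@"; do
    echo "${port}: $(check_port "${host}" "${port}")"
  done
}
```

For example, `check_ports 10.0.1.145 8300 8301 8302 8500 8502 8600` would probe the Consul server address used in the curl examples in this post. A "closed" result only means the TCP connection was refused or timed out; it does not say which side dropped it.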
From inside a connect-proxy container, I can query the Consul server:
$ curl 10.0.1.145:8500
<a href="/ui/">Moved Permanently</a>.
Another request, to Consul on port 8502 (the gRPC port), returns some data and then the connection is reset:
$ curl 10.0.1.145:8502 | wc -c
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100    42    0    42    0     0  10388      0 --:--:-- --:--:-- --:--:-- 14000
curl: (56) Recv failure: Connection reset by peer
21
Consul intentions allow traffic from all services to all services.
Nonetheless, curl hangs when connecting to the Unix socket file, and I suspect this is also what causes the “gRPC config stream closed: 14” warning that I mentioned above.
$ curl --unix-socket /alloc/tmp/consul_grpc.sock http:/v1/config -v
* Trying /alloc/tmp/consul_grpc.sock...
^C
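Two further checks seem worth running here: whether the socket file actually exists inside the allocation, and whether anything is listening on the gRPC port of the Consul agent local to the Nomad client. The sketch below covers the first one. `socket_status` is a hypothetical helper, and the second check rests on an assumption I have not verified in my setup: that Nomad forwards this socket to the address in the `grpc_address` setting of its `consul` block, which I believe defaults to 127.0.0.1:8502.

```shell
#!/usr/bin/env bash

# socket_status (hypothetical helper): reports whether PATH is a Unix socket,
# some other kind of file, or missing entirely.
socket_status() {
  local path="$1"
  if [ -S "$path" ]; then
    echo "socket"
  elif [ -e "$path" ]; then
    echo "not-a-socket"
  else
    echo "missing"
  fi
}
```

Inside the connect-proxy container this would be run as `socket_status /alloc/tmp/consul_grpc.sock`. On the Nomad client host, `ss -ltn | grep ':8502'` should show whether a local process is listening on the gRPC port at all.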
At this point I am a bit lost, and I would appreciate any ideas about what is missing or could be wrong in my setup.
I am likely missing some important detail in this post, but I am happy to share config files or anything else if it helps.