We are deploying TCP services in our Consul service mesh. The services do not have a health check endpoint.
The service A server listens on port p1. Service A is healthy and its health check is passing in Consul.
The service B client is trying to connect to service A.
Service A with port p1 is configured as an upstream in the sidecar_service of service B.
However, I am unable to connect to service A from service B through the proxy (Consul Connect). I have enabled Envoy debug logging and can see the following debug logs:
[2022-05-10 09:37:27.101][17][debug][filter] [source/common/tcp_proxy/tcp_proxy.cc:337] [C220] Creating connection to cluster service-A.default.ap-south-1.internal.e079d7e9-bce7-04f4-d92e-13045ba5dc92.consul
[2022-05-10 09:37:27.101][17][debug][upstream] [source/common/upstream/cluster_manager_impl.cc:1599] no healthy host for TCP connection pool
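To make the cluster name in that log line easier to read, here is a minimal Python sketch that splits it into its parts. It assumes the name follows the layout `<service>.<namespace>.<datacenter>.internal.<trust-domain>`, which matches the line above but may differ in other setups (for example with admin partitions). The names `ConsulClusterName` and `parse_cluster_name` are hypothetical helpers, not part of Consul or Envoy.

```python
from typing import NamedTuple


class ConsulClusterName(NamedTuple):
    service: str
    namespace: str
    datacenter: str
    trust_domain: str


def parse_cluster_name(name: str) -> ConsulClusterName:
    """Split an Envoy cluster name of the ASSUMED form
    <service>.<namespace>.<datacenter>.internal.<trust-domain>."""
    head, sep, trust_domain = name.partition(".internal.")
    if not sep:
        raise ValueError(f"no '.internal.' marker in {name!r}")
    # Split from the right so the last two labels are always
    # namespace and datacenter.
    parts = head.rsplit(".", 2)
    if len(parts) != 3:
        raise ValueError(f"expected service.namespace.datacenter, got {head!r}")
    service, namespace, datacenter = parts
    return ConsulClusterName(service, namespace, datacenter, trust_domain)


# Cluster name copied from the Envoy debug log above.
parsed = parse_cluster_name(
    "service-A.default.ap-south-1.internal."
    "e079d7e9-bce7-04f4-d92e-13045ba5dc92.consul"
)
print(parsed.service, parsed.namespace, parsed.datacenter)
```

Under that assumption, the first label is the service name the sidecar is trying to reach, so it can be compared with the names that are actually registered in Consul.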
I am able to connect to service A from service B if I hard-code the service A address in service B directly, without the proxy.
The jobs that are running:
Service A:
job "service-A-sandbox" {
  datacenters = [
    "ap-south-1a",
    "ap-south-1b",
    "ap-south-1c"
  ]
  type = "service"
  group "service-A-sandbox" {
    count = 1
    network {
      mode = "bridge"
      port "p1" {
        to = 9001
      }
    }
    service {
      name = "service-A-sandbox"
      port = "p1"
      connect {
        sidecar_service {
          tags = []
        }
      }
      # Dummy health check
      check {
        name     = "connect-proxy-service-A-health"
        type     = "script"
        task     = "service-A-sandbox"
        command  = "/bin/sh"
        args     = ["-c", "ls && exit 0; exit 1"]
        interval = "60s"
        timeout  = "5s"
      }
    }
    task "service-A-sandbox" {
      driver = "docker"
      config {
        image      = "https://ghcr.io/github-repo/service-A:Dockerfile"
        force_pull = true
      }
      resources {
        cpu    = 300
        memory = 512
      }
    }
  }
}
Service B:
job "service-B" {
  datacenters = [
    "ap-south-1a",
    "ap-south-1b",
    "ap-south-1c"
  ]
  type = "service"
  group "service-B" {
    count = 1
    network {
      mode = "bridge"
      port "p2" {
        to = 3125
      }
    }
    service {
      name = "service-B"
      port = "p2"
      connect {
        sidecar_service {
          tags = []
          proxy {
            upstreams {
              destination_name = "service-A"
              local_bind_port  = 9001
            }
          }
        }
      }
      # Dummy health check
      check {
        name     = "service-B-health"
        type     = "script"
        task     = "service-B"
        command  = "ls"
        interval = "5s"
        timeout  = "3s"
      }
    }
    task "service-B" {
      driver = "docker"
      config {
        image      = "https://ghcr.io/github-repo/service-B:Dockerfile"
        force_pull = true
      }
      resources {
        cpu    = 300
        memory = 512
      }
    }
  }
}
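As a cross-check of the two job files, here is a small Python sketch that compares the upstream destination_name values with the service names the jobs register. The names are copied by hand from the jobs above, and `unmatched_upstreams` is a hypothetical helper. It only compares strings and does not query Consul, so it may or may not be related to the "no healthy host" message.

```python
from typing import Iterable, List


def unmatched_upstreams(registered: Iterable[str], upstreams: Iterable[str]) -> List[str]:
    """Return upstream destination names with no identically named
    registered service (exact, case-sensitive match)."""
    known = set(registered)
    return [name for name in upstreams if name not in known]


# Values of `service { name = ... }` in the two jobs above.
registered_services = ["service-A-sandbox", "service-B"]
# Values of `upstreams { destination_name = ... }` in the service-B job.
upstream_destinations = ["service-A"]

print(unmatched_upstreams(registered_services, upstream_destinations))
# prints ['service-A']
```

For these two jobs the sketch reports that the upstream name "service-A" does not exactly match either registered name ("service-A-sandbox" or "service-B").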
Logs at service B:
<log realm="service-A-channel/127.0.0.1:9001">
<connect>
Try 0 localhost:9001
</connect>
</log>
<log realm="service-A-channel/127.0.0.1:9001">
<receive>
<peer-disconnect/>
</receive>
</log>
<log realm="service-A-channel">
<warn>
channel-receiver-service-A-receive
Read timeout / EOF - reconnecting
</warn>
</log>