I’m working on a Postgres cluster (using Patroni). I got all 3 instances up and running, and reachable through the service mesh. Each instance has a tag indicating its current role, either master or replica.
Now I’d like to reach the current master (for write access) or one replica (e.g., for backups) through the service mesh.
But now I’m stuck: how can I specify one of those subsets in the upstream section of another service?
service {
  name = "backup"

  connect {
    sidecar_service {
      proxy {
        upstreams {
          # I want this to point at the replica subset of the pg service
          destination_name = "pg-replica"
          local_bind_port  = 5432
        }
      }
    }
  }
}
I couldn’t find anything in the documentation on how to use subsets defined in a service-resolver as an upstream, so I created a virtual service for pg-master:
Kind = "service-resolver"
Name = "pg-master"
Redirect {
  Service       = "pg"
  ServiceSubset = "master"
}
And a virtual service for pg-replica:
Kind = "service-resolver"
Name = "pg-replica"
Redirect {
  Service       = "pg"
  ServiceSubset = "replica"
}
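For these redirects to resolve, the subsets also have to be defined on a resolver for the real pg service. Mine filters on the role tags Patroni sets; roughly like this (the exact filter expressions may need adjusting to your tag names):
Kind = "service-resolver"
Name = "pg"
Subsets = {
  master = {
    Filter = "Service.Tags contains \"master\""
  }
  replica = {
    Filter = "Service.Tags contains \"replica\""
  }
}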
So everything should be in place. Yet if I use “pg-master” or “pg-replica” as the destination_name of an upstream, I can’t get it to work (I get something like “read tcp 127.0.0.1:60284->127.0.0.1:5432: read: connection reset by peer”).
One thing that’s not clear in the docs: should the intention target the virtual service “pg-master”, or the real service “pg”? (I’ve added both for now, so the problem shouldn’t be with the intentions.) Here’s what the sidecar’s Envoy debug log shows:
[2022-10-05 09:40:58.607][15][debug][filter] [source/common/tcp_proxy/tcp_proxy.cc:370] [C41] Creating connection to cluster pg-master.default.dc1.internal.c3d20f32-f916-9621-b828-7111cdf716d3.consul
[2022-10-05 09:40:58.607][15][debug][upstream] [source/common/upstream/cluster_manager_impl.cc:1761] no healthy host for TCP connection pool
[2022-10-05 09:40:58.607][15][debug][connection] [source/common/network/connection_impl.cc:139] [C41] closing data_to_write=0 type=1
[2022-10-05 09:40:58.607][15][debug][connection] [source/common/network/connection_impl.cc:250] [C41] closing socket: 1
[2022-10-05 09:40:58.607][15][debug][filter] [source/common/tcp_proxy/tcp_proxy.cc:198] [C42] new tcp proxy session
[2022-10-05 09:40:58.607][15][debug][filter] [source/common/tcp_proxy/tcp_proxy.cc:370] [C42] Creating connection to cluster pg-master.default.dc1.internal.c3d20f32-f916-9621-b828-7111cdf716d3.consul
[2022-10-05 09:40:58.607][15][debug][upstream] [source/common/upstream/cluster_manager_impl.cc:1761] no healthy host for TCP connection pool
[2022-10-05 09:40:58.607][15][debug][connection] [source/common/network/connection_impl.cc:139] [C42] closing data_to_write=0 type=1
[2022-10-05 09:40:58.607][15][debug][connection] [source/common/network/connection_impl.cc:250] [C42] closing socket: 1
Which I find strange, because I’m not using the default “.consul” domain name on my Consul agents, so I’m not sure why it’s trying to connect to pg-master.default.dc1.internal.c3d20f32-f916-9621-b828-7111cdf716d3.consul.
I finally got it working. Everything was OK on the service-resolver side; my issue was just an error on my part (sidecar_service accidentally commented out).
Just one thing to keep in mind when using this to split a service: the intention must use the real service as its destination, not the virtual one corresponding to a subset (in my case, I must use pg as the destination in the intention, not pg-master or pg-replica).
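Expressed as a config entry, the intention looks something like this (backup being the consuming service from my example above; adjust names to your setup):
Kind = "service-intentions"
Name = "pg"   # the real service, not pg-master / pg-replica
Sources = [
  {
    Name   = "backup"
    Action = "allow"
  }
]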
Another side note: when using Service.Tags in a resolver filter like this, it’s the tags of the sidecar service that matter, not those of the service itself. In my case the service is pg and the sidecar is pg-sidecar-proxy, and I must make sure my tags (master and replica) are added to the pg-sidecar-proxy service, not the pg service (well, I push them to both, but for the mesh to route correctly only pg-sidecar-proxy is needed).
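To illustrate with a plain Consul service definition (Patroni updates the tags dynamically in practice, so treat this as a sketch):
service {
  name = "pg"
  port = 5432
  tags = ["replica"]   # the role tag on the service itself

  connect {
    sidecar_service {
      # the resolver filter matches the sidecar's tags,
      # so the role tag must be present here too
      tags = ["replica"]
    }
  }
}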
Hi @dbd,
I am trying to set up a Patroni cluster in Nomad and am struggling with the service resolver.
My main goal is to be able to reach the replica subset through a sidecar, so services could use any of the replicas for read-only operations.
I’m not interested in exposing Patroni (or Postgres) outside of the cluster, only to other services in the same cluster.
From what I can see, Patroni does tag the various instances with the corresponding primary or replica tags, but the service is still inaccessible.
I tried manually configuring the resolvers, but the main router to the service seems broken in the Consul UI.
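For reference, the resolver I’ve been experimenting with looks roughly like this (patroni is the service name in my setup, and per the note above the primary/replica tags have to be on the sidecar service for the filters to match):
Kind = "service-resolver"
Name = "patroni"
Subsets = {
  primary = {
    Filter = "Service.Tags contains \"primary\""
  }
  replica = {
    Filter = "Service.Tags contains \"replica\""
  }
}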