Unable to connect services between datacenters despite working mesh gateways

Hello, I have two working Consul DCs that are federated via mesh gateways. Running “consul members -wan” shows the servers in both DCs, the UI shows no errors, and I can switch between DCs and see the services running in each. My mesh gateways are set up following Connect Services Across Datacenters with Mesh Gateways | Consul - HashiCorp Learn; the only differences are that I have disabled ACLs (I can re-enable them if necessary) and that my mesh gateways also use “-expose-servers” to perform the WAN federation.

I have tried the socat example on that page several times, and also the Consul Connect count dashboard example (Consul Connect | Nomad by HashiCorp).

The connect example works perfectly when both the api service and the dashboard are located in the same DC, but when I move the dashboard to the secondary DC it does not. In the job definition for the dashboard I have set:

datacenter = "primary"
mesh_gateway {
  mode = "local"
}
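
For context, here is a minimal sketch of where those two settings would sit in the dashboard job. It assumes the service names and ports from the Nomad countdash example ("count-dashboard", "count-api", 9002, 8080), which may not match the actual job; only the datacenter and mesh_gateway values come from the job described above.

```hcl
# Sketch only: service names and ports are assumed from the countdash example.
service {
  name = "count-dashboard"
  port = "9002"

  connect {
    sidecar_service {
      proxy {
        upstreams {
          destination_name = "count-api"
          local_bind_port  = 8080

          # Resolve the upstream in the other Consul datacenter.
          datacenter = "primary"

          # Send upstream traffic through the mesh gateway in the
          # dashboard's own datacenter, which forwards it to "primary".
          mesh_gateway {
            mode = "local"
          }
        }
      }
    }
  }
}
```

Both settings belong inside the upstreams block, since they apply per upstream rather than to the whole service.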

I can see no errors in the logs for the sidecar service or for the Envoy mesh gateway.

I’m using the latest release of Consul and Nomad.

Any ideas on where to dig next would be greatly appreciated.

Thanks

OK, I’ve resolved the issue. It was the “Known Issue” in 1.10.1. After reverting to 1.10.0, everything works perfectly.
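
For anyone who cannot downgrade: as far as I recall, the known-issue note in the 1.10.1 changelog suggested explicitly disabling the streaming backend when using WAN federation over mesh gateways. A minimal sketch of that agent setting follows. I have not tested it in this setup, and which agents need it is an assumption on my part.

```hcl
# Consul agent configuration (sketch, untested here).
# Assumption: applied to the agents in both datacenters, followed by an agent restart.
# Disables the streaming backend so that requests use the older
# blocking-query path instead of gRPC streaming.
use_streaming_backend = false
```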

Hi @kkbe,

Glad you were able to resolve this issue by downgrading versions.

Consul 1.10.2, released on Aug 28th, fixes the issue where gRPC-based streaming did not work over mesh gateways. Here’s the relevant note from the changelog:

BUG FIXES:
grpc: ensure that streaming gRPC requests work over mesh gateway based wan federation [GH-10838]