Cannot connect two DCs through Mesh Gateway with VMs

I created two data centers DC01 and DC02 according to the documentation, and started the Envoy proxy using consul connect envoy -gateway mesh in both data centers.

  • The versions I am using are as follows:

Consul v1.11.4
Revision 944e8ce6
Protocol 2 spoken by default, understands 2 to 3 (agent will automatically use protocol >2 when speaking to compatible agents)

Envoy 1.20.2
envoy version: 4aaf9593152c6996b9da384c8918e9ad4f0abd4d/1.20.2-dev/Clean/RELEASE/BoringSSL


My current status:

  • consul members -wan already lists the DC02 server, and its status is alive.
Node               Address         Status  Type    Build   Protocol  DC    Partition  Segment
64cf80e8a60d.dc01  10.0.1.20:8302  alive   server  1.11.4  2         dc01  default    <all>
6e119d2aca15.dc02  10.0.2.20:8302  alive   server  1.11.4  2         dc02  default    <all>
  • /v1/catalog/services?dc= works for dc01 but fails for dc02:
$ curl http://127.0.0.1:8500/v1/catalog/services?dc=dc01
{"consul":[],"mesh-gateway-dc01":[]}
$ curl http://127.0.0.1:8500/v1/catalog/services?dc=dc02
Remote DC has no server currently reachable
  • The Envoy log shows that the cluster for the DC02 server cannot be found,
    for example: Cluster not found 6e119d2aca15.server.dc02.consul
[2022-03-11 01:35:26.283][497][debug][filter] [source/extensions/filters/listener/tls_inspector/tls_inspector.cc:77] tls inspector: new connection accepted
[2022-03-11 01:35:26.283][497][trace][filter] [source/extensions/filters/listener/tls_inspector/tls_inspector.cc:169] tls inspector: recv: 0
[2022-03-11 01:35:26.285][497][trace][filter] [source/extensions/filters/listener/tls_inspector/tls_inspector.cc:169] tls inspector: recv: 310
[2022-03-11 01:35:26.285][497][trace][filter] [source/extensions/filters/listener/tls_inspector/tls_inspector.cc:139] tls:onALPN(), ALPN: consul/wan-gossip/packet
[2022-03-11 01:35:26.285][497][debug][filter] [source/extensions/filters/listener/tls_inspector/tls_inspector.cc:148] tls:onServerName(), requestedServerName: 6e119d2aca15.server.dc02.consul
[2022-03-11 01:35:26.285][497][trace][filter] [source/extensions/filters/listener/tls_inspector/tls_inspector.cc:191] tls inspector: done: true
[2022-03-11 01:35:26.285][497][debug][filter] [source/common/tcp_proxy/tcp_proxy.cc:190] [C72] new tcp proxy session
[2022-03-11 01:35:26.285][497][trace][connection] [source/common/network/connection_impl.cc:356] [C72] readDisable: disable=true disable_count=0 state=0 buffer_length=0
[2022-03-11 01:35:26.285][497][trace][filter] [source/extensions/filters/network/sni_cluster/sni_cluster.cc:16] [C72] sni_cluster: new connection with server name 6e119d2aca15.server.dc02.consul
[2022-03-11 01:35:26.285][497][debug][filter] [source/common/tcp_proxy/tcp_proxy.cc:340] [C72] Cluster not found 6e119d2aca15.server.dc02.consul
[2022-03-11 01:35:26.285][497][debug][connection] [source/common/network/connection_impl.cc:138] [C72] closing data_to_write=0 type=1
[2022-03-11 01:35:26.285][497][debug][connection] [source/common/network/connection_impl.cc:249] [C72] closing socket: 1
[2022-03-11 01:35:26.285][497][trace][connection] [source/common/network/connection_impl.cc:417] [C72] raising connection event 1
[2022-03-11 01:35:26.285][497][trace][main] [source/common/event/dispatcher_impl.cc:255] item added to deferred deletion list (size=1)
[2022-03-11 01:35:26.285][497][trace][main] [source/common/event/dispatcher_impl.cc:117] clearing deferred deletion list (size=1)
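
To spot this failure quickly in a long trace log, the "Cluster not found" lines can be pulled out with a short script. This is only a sketch: the function name `missing_clusters` is made up for illustration, and the regular expression assumes the tcp_proxy debug line format shown in the log above.

```python
import re

# Matches Envoy tcp_proxy debug lines of the form seen above, e.g.
#   ... [C72] Cluster not found 6e119d2aca15.server.dc02.consul
# (assumption: other Envoy versions may word this line differently)
NOT_FOUND = re.compile(r"\[C(\d+)\] Cluster not found (\S+)")


def missing_clusters(log_text):
    """Return the distinct names Envoy could not map to a cluster, in first-seen order."""
    names = []
    for match in NOT_FOUND.finditer(log_text):
        name = match.group(2)
        if name not in names:
            names.append(name)
    return names


# Two lines copied from the log above.
sample = """\
[2022-03-11 01:35:26.285][497][trace][filter] [source/extensions/filters/network/sni_cluster/sni_cluster.cc:16] [C72] sni_cluster: new connection with server name 6e119d2aca15.server.dc02.consul
[2022-03-11 01:35:26.285][497][debug][filter] [source/common/tcp_proxy/tcp_proxy.cc:340] [C72] Cluster not found 6e119d2aca15.server.dc02.consul
"""

print(missing_clusters(sample))
```

Each name it reports is an SNI value that the gateway received but has no matching cluster for.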

My Consul configuration file is as follows:

  • DC01 (Primary):
{
  "datacenter": "dc01",
  "bootstrap": true,
  "connect": {
    "enabled": true,
    "enable_mesh_gateway_wan_federation": true
  },
  "ports": {
    "https": 8501,
    "grpc": 8502
  },
  "ui_config": {
    "enabled": true
  },
  "server": true,
  "log_level": "trace",
  "cert_file": "/consul/config/certs/dc01-server-consul-2.pem",
  "key_file": "/consul/config/certs/dc01-server-consul-2-key.pem",
  "ca_file": "/consul/config/certs/consul-agent-ca.pem",
  "verify_incoming_rpc": true,
  "verify_outgoing": true,
  "verify_server_hostname": true,
  "auto_encrypt": {
    "allow_tls": true
  }
}
  • DC02 (Secondary):
{
  "datacenter": "dc02",
  "bootstrap": true,
  "primary_datacenter": "dc01",
  "primary_gateways": [
    "<dc01-envoy-public-ip>:8443"
  ],
  "connect": {
    "enabled": true,
    "enable_mesh_gateway_wan_federation": true
  },
  "enable_central_service_config": true,
  "ports": {
    "https": 8501,
    "grpc": 8502
  },
  "ui_config": {
    "enabled": true
  },
  "server": true,
  "log_level": "trace",
  "cert_file": "/consul/config/certs/dc02-server-consul-2.pem",
  "key_file": "/consul/config/certs/dc02-server-consul-2-key.pem",
  "ca_file": "/consul/config/certs/consul-agent-ca.pem",
  "verify_incoming_rpc": true,
  "verify_outgoing": true,
  "verify_server_hostname": true,
  "auto_encrypt": {
    "allow_tls": true
  }
}

The Envoy proxy in each data center reports the following clusters:

  • DC01:
local_agent::observability_name::local_agent
local_agent::default_priority::max_connections::1024
local_agent::default_priority::max_pending_requests::1024
local_agent::default_priority::max_requests::1024
local_agent::default_priority::max_retries::3
local_agent::high_priority::max_connections::1024
local_agent::high_priority::max_pending_requests::1024
local_agent::high_priority::max_requests::1024
local_agent::high_priority::max_retries::3
local_agent::added_via_api::false
local_agent::10.0.1.20:8502::cx_active::1
local_agent::10.0.1.20:8502::cx_connect_fail::0
local_agent::10.0.1.20:8502::cx_total::1
local_agent::10.0.1.20:8502::rq_active::1
local_agent::10.0.1.20:8502::rq_error::0
local_agent::10.0.1.20:8502::rq_success::0
local_agent::10.0.1.20:8502::rq_timeout::0
local_agent::10.0.1.20:8502::rq_total::1
local_agent::10.0.1.20:8502::hostname::
local_agent::10.0.1.20:8502::health_flags::healthy
local_agent::10.0.1.20:8502::weight::1
local_agent::10.0.1.20:8502::region::
local_agent::10.0.1.20:8502::zone::
local_agent::10.0.1.20:8502::sub_zone::
local_agent::10.0.1.20:8502::canary::false
local_agent::10.0.1.20:8502::priority::0
local_agent::10.0.1.20:8502::success_rate::-1.0
local_agent::10.0.1.20:8502::local_origin_success_rate::-1.0
consul-dc01.server.dc01.consul::observability_name::consul-dc01.server.dc01.consul
consul-dc01.server.dc01.consul::outlier::success_rate_average::-1
consul-dc01.server.dc01.consul::outlier::success_rate_ejection_threshold::-1
consul-dc01.server.dc01.consul::outlier::local_origin_success_rate_average::-1
consul-dc01.server.dc01.consul::outlier::local_origin_success_rate_ejection_threshold::-1
consul-dc01.server.dc01.consul::default_priority::max_connections::1024
consul-dc01.server.dc01.consul::default_priority::max_pending_requests::1024
consul-dc01.server.dc01.consul::default_priority::max_requests::1024
consul-dc01.server.dc01.consul::default_priority::max_retries::3
consul-dc01.server.dc01.consul::high_priority::max_connections::1024
consul-dc01.server.dc01.consul::high_priority::max_pending_requests::1024
consul-dc01.server.dc01.consul::high_priority::max_requests::1024
consul-dc01.server.dc01.consul::high_priority::max_retries::3
consul-dc01.server.dc01.consul::added_via_api::true
consul-dc01.server.dc01.consul::10.0.1.20:8300::cx_active::0
consul-dc01.server.dc01.consul::10.0.1.20:8300::cx_connect_fail::0
consul-dc01.server.dc01.consul::10.0.1.20:8300::cx_total::0
consul-dc01.server.dc01.consul::10.0.1.20:8300::rq_active::0
consul-dc01.server.dc01.consul::10.0.1.20:8300::rq_error::0
consul-dc01.server.dc01.consul::10.0.1.20:8300::rq_success::0
consul-dc01.server.dc01.consul::10.0.1.20:8300::rq_timeout::0
consul-dc01.server.dc01.consul::10.0.1.20:8300::rq_total::0
consul-dc01.server.dc01.consul::10.0.1.20:8300::hostname::
consul-dc01.server.dc01.consul::10.0.1.20:8300::health_flags::healthy
consul-dc01.server.dc01.consul::10.0.1.20:8300::weight::1
consul-dc01.server.dc01.consul::10.0.1.20:8300::region::
consul-dc01.server.dc01.consul::10.0.1.20:8300::zone::
consul-dc01.server.dc01.consul::10.0.1.20:8300::sub_zone::
consul-dc01.server.dc01.consul::10.0.1.20:8300::canary::false
consul-dc01.server.dc01.consul::10.0.1.20:8300::priority::0
consul-dc01.server.dc01.consul::10.0.1.20:8300::success_rate::-1.0
consul-dc01.server.dc01.consul::10.0.1.20:8300::local_origin_success_rate::-1.0
server.dc01.consul::observability_name::server.dc01.consul
server.dc01.consul::outlier::success_rate_average::-1
server.dc01.consul::outlier::success_rate_ejection_threshold::-1
server.dc01.consul::outlier::local_origin_success_rate_average::-1
server.dc01.consul::outlier::local_origin_success_rate_ejection_threshold::-1
server.dc01.consul::default_priority::max_connections::1024
server.dc01.consul::default_priority::max_pending_requests::1024
server.dc01.consul::default_priority::max_requests::1024
server.dc01.consul::default_priority::max_retries::3
server.dc01.consul::high_priority::max_connections::1024
server.dc01.consul::high_priority::max_pending_requests::1024
server.dc01.consul::high_priority::max_requests::1024
server.dc01.consul::high_priority::max_retries::3
server.dc01.consul::added_via_api::true
server.dc01.consul::10.0.1.20:8300::cx_active::0
server.dc01.consul::10.0.1.20:8300::cx_connect_fail::0
server.dc01.consul::10.0.1.20:8300::cx_total::0
server.dc01.consul::10.0.1.20:8300::rq_active::0
server.dc01.consul::10.0.1.20:8300::rq_error::0
server.dc01.consul::10.0.1.20:8300::rq_success::0
server.dc01.consul::10.0.1.20:8300::rq_timeout::0
server.dc01.consul::10.0.1.20:8300::rq_total::0
server.dc01.consul::10.0.1.20:8300::hostname::
server.dc01.consul::10.0.1.20:8300::health_flags::healthy
server.dc01.consul::10.0.1.20:8300::weight::1
server.dc01.consul::10.0.1.20:8300::region::
server.dc01.consul::10.0.1.20:8300::zone::
server.dc01.consul::10.0.1.20:8300::sub_zone::
server.dc01.consul::10.0.1.20:8300::canary::false
server.dc01.consul::10.0.1.20:8300::priority::0
server.dc01.consul::10.0.1.20:8300::success_rate::-1.0
server.dc01.consul::10.0.1.20:8300::local_origin_success_rate::-1.0
  • DC02:
local_agent::observability_name::local_agent
local_agent::default_priority::max_connections::1024
local_agent::default_priority::max_pending_requests::1024
local_agent::default_priority::max_requests::1024
local_agent::default_priority::max_retries::3
local_agent::high_priority::max_connections::1024
local_agent::high_priority::max_pending_requests::1024
local_agent::high_priority::max_requests::1024
local_agent::high_priority::max_retries::3
local_agent::added_via_api::false
local_agent::10.0.2.20:8502::cx_active::1
local_agent::10.0.2.20:8502::cx_connect_fail::0
local_agent::10.0.2.20:8502::cx_total::1
local_agent::10.0.2.20:8502::rq_active::1
local_agent::10.0.2.20:8502::rq_error::0
local_agent::10.0.2.20:8502::rq_success::0
local_agent::10.0.2.20:8502::rq_timeout::0
local_agent::10.0.2.20:8502::rq_total::1
local_agent::10.0.2.20:8502::hostname::
local_agent::10.0.2.20:8502::health_flags::healthy
local_agent::10.0.2.20:8502::weight::1
local_agent::10.0.2.20:8502::region::
local_agent::10.0.2.20:8502::zone::
local_agent::10.0.2.20:8502::sub_zone::
local_agent::10.0.2.20:8502::canary::false
local_agent::10.0.2.20:8502::priority::0
local_agent::10.0.2.20:8502::success_rate::-1.0
local_agent::10.0.2.20:8502::local_origin_success_rate::-1.0
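
The difference between the two listings is easier to see when they are reduced to cluster names. The sketch below assumes the text above is the output of Envoy's admin `/clusters` endpoint, where every line starts with the cluster name followed by `::`. The helper names are hypothetical, and the samples are shortened copies of the listings.

```python
def cluster_names(clusters_text):
    """Return the distinct cluster names in an Envoy /clusters text dump, in order."""
    names = []
    for line in clusters_text.splitlines():
        line = line.strip()
        if "::" not in line:
            continue  # skip blank or unrelated lines
        name = line.split("::", 1)[0]
        if name not in names:
            names.append(name)
    return names


def server_clusters(names, dc):
    """Keep only the clusters that point at Consul servers of the given data center."""
    suffix = f"server.{dc}.consul"
    return [n for n in names if n == suffix or n.endswith("." + suffix)]


# Shortened samples taken from the listings above.
dc01_dump = """\
local_agent::added_via_api::false
consul-dc01.server.dc01.consul::added_via_api::true
consul-dc01.server.dc01.consul::10.0.1.20:8300::health_flags::healthy
server.dc01.consul::added_via_api::true
server.dc01.consul::10.0.1.20:8300::health_flags::healthy
"""
dc02_dump = """\
local_agent::added_via_api::false
local_agent::10.0.2.20:8502::health_flags::healthy
"""

print("dc01:", server_clusters(cluster_names(dc01_dump), "dc01"))
print("dc02:", server_clusters(cluster_names(dc02_dump), "dc02"))
```

With these samples, the DC01 gateway has server clusters for its own data center, while the DC02 gateway only has `local_agent`.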

This problem has been bothering me for a long time and I have not been able to find an answer. I have included my configuration above in the hope that it helps. Any help would be appreciated.

I have found the cause of the problem. Because I am running Consul in Docker Swarm, port 8300 was not published to the public network. DC01 must be able to reach port 8300 of DC02 for server access to work.
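
For anyone who hits the same symptom, a plain TCP check of the server port is a quick first test. This is a minimal sketch using only the Python standard library; `tcp_port_open` is a hypothetical helper, not part of Consul. It only shows whether a TCP connection can be opened, not whether Consul RPC actually works over it.

```python
import socket


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable network, and timeouts.
        return False
```

Run it from the host or container that needs to reach the DC02 server, for example `tcp_port_open("10.0.2.20", 8300)`, where 10.0.2.20 is the DC02 server address shown by consul members -wan above. A result of `False` suggests the port is not published or is blocked along the way.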