Hi,
I’m trying to get started with Consul and mesh gateways. I just want to test how the gateway works between a k8s cluster and a VM cluster. I’ve tried to follow these articles:
- WAN Federation Through Mesh Gateways - VMs and Kubernetes | Consul | HashiCorp Developer
- WAN Federation Through Mesh Gateways - Multiple Kubernetes Clusters | Consul | HashiCorp Developer
I’m just testing on my local PC: deploying to a minikube instance, running Consul locally, and trying to connect the two.
So I’m using these helm values to deploy to minikube:
global:
  name: consul
  datacenter: k8s-primary
  tls:
    enabled: true
  federation:
    enabled: true
    createFederationSecret: true
  acls:
    manageSystemACLs: false
    createReplicationToken: false
ui:
  service:
    type: 'NodePort'
connectInject:
  enabled: true
meshGateway:
  enabled: true
  replicas: 1
  service:
    type: NodePort
    nodePort: 30085
  wanAddress:
    source: Service
  affinity: null
server:
  affinity: null
And to run Consul locally I’m running:
consul agent -config-file consul.hcl
With the contents of consul.hcl being:
cert_file = "/home/zane/IdeaProjects/consul/vm-secondary-server-consul-0.pem"
key_file = "/home/zane/IdeaProjects/consul/vm-secondary-server-consul-0-key.pem"
ca_file = "/home/zane/IdeaProjects/consul/consul-agent-ca.pem"
primary_gateways = ["192.168.99.104:30085"]
server = true
datacenter = "vm-secondary"
data_dir = "/home/zane/IdeaProjects/consul/data"
enable_central_service_config = true
primary_datacenter = "k8s-primary"
connect {
  enabled = true
  enable_mesh_gateway_wan_federation = true
}
verify_incoming_rpc = true
verify_outgoing = true
verify_server_hostname = true
ports {
  https = 8501
  http = -1
  grpc = 8502
}
bind_addr = "192.168.99.1"
bootstrap = false
bootstrap_expect = 1
With this config I’m getting “no acks received” errors. Here are the logs from the k8s server:
2020-07-23T09:30:21.487Z [INFO] agent.server.memberlist.wan: memberlist: Suspect zane-pc.vm-secondary has failed, no acks received
2020-07-23T09:30:23.686Z [ERROR] agent.server.memberlist.wan: memberlist: Failed to send indirect ping: read tcp 172.17.0.7:36064->172.17.0.5:8443: read: connection reset by peer from=172.17.0.9:8302
2020-07-23T09:30:26.487Z [INFO] agent.server.memberlist.wan: memberlist: Marking zane-pc.vm-secondary as failed, suspect timeout reached (0 peer confirmations)
2020-07-23T09:30:26.487Z [INFO] agent.server.serf.wan: serf: EventMemberFailed: zane-pc.vm-secondary 192.168.99.1
2020-07-23T09:30:26.487Z [INFO] agent.server: Handled event for server in area: event=member-failed server=zane-pc.vm-secondary area=wan
2020-07-23T09:30:26.988Z [ERROR] agent.server.memberlist.wan: memberlist: Failed to send gossip to 192.168.99.1:8302: read tcp 172.17.0.7:36102->172.17.0.5:8443: read: connection reset by peer
2020-07-23T09:30:39.082Z [ERROR] agent.server.rpc: RPC failed to server in DC: server=192.168.99.1:8300 datacenter=vm-secondary method=Internal.ServiceDump error="rpc error getting client: failed to get conn: read tcp 172.17.0.7:36218->172.17.0.5:8443: read: connection reset by peer"
2020-07-23T09:30:48.827Z [INFO] agent.server.serf.wan: serf: EventMemberJoin: zane-pc.vm-secondary 192.168.99.1
2020-07-23T09:30:48.827Z [INFO] agent.server: Handled event for server in area: event=member-join server=zane-pc.vm-secondary area=wan
2020-07-23T09:30:48.991Z [ERROR] agent.server.memberlist.wan: memberlist: Failed to send gossip to 192.168.99.1:8302: read tcp 172.17.0.7:36324->172.17.0.5:8443: read: connection reset by peer
2020-07-23T09:30:49.488Z [ERROR] agent.server.memberlist.wan: memberlist: Failed to send gossip to 192.168.99.1:8302: read tcp 172.17.0.7:36330->172.17.0.5:8443: read: connection reset by peer
2020-07-23T09:30:53.186Z [WARN] agent.server.memberlist.wan: memberlist: Refuting a suspect message (from: zane-pc.vm-secondary)
2020-07-23T09:30:53.488Z [ERROR] agent.server.memberlist.wan: memberlist: Failed to send gossip to 192.168.99.1:8302: read tcp 172.17.0.7:36366->172.17.0.5:8443: read: connection reset by peer
2020-07-23T09:30:53.687Z [ERROR] agent.server.memberlist.wan: memberlist: Failed to send indirect ping: read tcp 172.17.0.7:36374->172.17.0.5:8443: read: connection reset by peer from=172.17.0.9:8302
2020-07-23T09:30:56.489Z [ERROR] agent.server.memberlist.wan: memberlist: Failed to send ping: read tcp 172.17.0.7:36406->172.17.0.5:8443: read: connection reset by peer
2020-07-23T09:30:57.385Z [ERROR] agent.server.memberlist.wan: memberlist: Failed to send indirect ping: read tcp 172.17.0.7:36424->172.17.0.5:8443: read: connection reset by peer from=172.17.0.8:8302
2020-07-23T09:30:58.989Z [ERROR] agent.server.memberlist.wan: memberlist: Failed to send gossip to 192.168.99.1:8302: read tcp 172.17.0.7:36452->172.17.0.5:8443: read: connection reset by peer
2020-07-23T09:30:59.488Z [ERROR] agent.server.memberlist.wan: memberlist: Failed to send gossip to 192.168.99.1:8302: read tcp 172.17.0.7:36458->172.17.0.5:8443: read: connection reset by peer
2020-07-23T09:31:06.487Z [INFO] agent.server.memberlist.wan: memberlist: Suspect zane-pc.vm-secondary has failed, no acks received
2020-07-23T09:31:06.580Z [ERROR] agent.server.rpc: RPC failed to server in DC: server=192.168.99.1:8300 datacenter=vm-secondary method=Internal.ServiceDump error="rpc error getting client: failed to get conn: read tcp 172.17.0.7:36514->172.17.0.5:8443: read: connection reset by peer"
2020-07-23T09:31:06.589Z [ERROR] agent.server.rpc: RPC failed to server in DC: server=192.168.99.1:8300 datacenter=vm-secondary method=Internal.ServiceDump error="rpc error getting client: failed to get conn: read tcp 172.17.0.7:36520->172.17.0.5:8443: read: connection reset by peer"
2020-07-23T09:31:08.690Z [ERROR] agent.server.memberlist.wan: memberlist: Failed to send indirect ping: read tcp 172.17.0.7:36544->172.17.0.5:8443: read: connection reset by peer from=172.17.0.9:8302
2020-07-23T09:31:12.385Z [ERROR] agent.server.memberlist.wan: memberlist: Failed to send indirect ping: read tcp 172.17.0.7:36582->172.17.0.5:8443: read: connection reset by peer from=172.17.0.8:8302
2020-07-23T09:31:26.488Z [ERROR] agent.server.memberlist.wan: memberlist: Failed to send compound ping and suspect message to 192.168.99.1:8302: read tcp 172.17.0.7:36710->172.17.0.5:8443: read: connection reset by peer
2020-07-23T09:31:28.687Z [ERROR] agent.server.memberlist.wan: memberlist: Failed to send indirect ping: read tcp 172.17.0.7:36744->172.17.0.5:8443: read: connection reset by peer from=172.17.0.9:8302
2020-07-23T09:31:28.883Z [INFO] agent.server.memberlist.wan: memberlist: Marking zane-pc.vm-secondary as failed, suspect timeout reached (0 peer confirmations)
2020-07-23T09:31:28.883Z [INFO] agent.server.serf.wan: serf: EventMemberFailed: zane-pc.vm-secondary 192.168.99.1
2020-07-23T09:31:28.883Z [INFO] agent.server: Handled event for server in area: event=member-failed server=zane-pc.vm-secondary area=wan
2020-07-23T09:31:31.487Z [INFO] agent.server.memberlist.wan: memberlist: Suspect zane-pc.vm-secondary has failed, no acks received
2020-07-23T09:31:40.629Z [ERROR] agent.server.rpc: RPC failed to server in DC: server=192.168.99.1:8300 datacenter=vm-secondary method=Internal.ServiceDump error="rpc error getting client: failed to get conn: read tcp 172.17.0.7:36848->172.17.0.5:8443: read: connection reset by peer"
2020-07-23T09:31:49.302Z [ERROR] agent.server.rpc: RPC failed to server in DC: server=192.168.99.1:8300 datacenter=vm-secondary method=Internal.ServiceDump error="rpc error getting client: failed to get conn: read tcp 172.17.0.7:36938->172.17.0.5:8443: read: connection reset by peer"
2020-07-23T09:31:56.502Z [INFO] agent.server.serf.wan: serf: attempting reconnect to zane-pc.vm-secondary 192.168.99.1:8302
2020-07-23T09:32:20.472Z [ERROR] agent.server.rpc: RPC failed to server in DC: server=192.168.99.1:8300 datacenter=vm-secondary method=Internal.ServiceDump error="rpc error getting client: failed to get conn: read tcp 172.17.0.7:37206->172.17.0.5:8443: read: connection reset by peer"
2020-07-23T09:32:51.619Z [ERROR] agent.server.rpc: RPC failed to server in DC: server=192.168.99.1:8300 datacenter=vm-secondary method=Internal.ServiceDump error="rpc error getting client: failed to get conn: read tcp 172.17.0.7:37480->172.17.0.5:8443: read: connection reset by peer"
2020-07-23T09:32:52.090Z [ERROR] agent.server.rpc: RPC failed to server in DC: server=192.168.99.1:8300 datacenter=vm-secondary method=Internal.ServiceDump error="rpc error getting client: failed to get conn: read tcp 172.17.0.7:37490->172.17.0.5:8443: read: connection reset by peer"
2020-07-23T09:33:00.535Z [ERROR] agent.server.rpc: RPC failed to server in DC: server=192.168.99.1:8300 datacenter=vm-secondary method=Internal.ServiceDump error="rpc error getting client: failed to get conn: read tcp 172.17.0.7:37568->172.17.0.5:8443: read: connection reset by peer"
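In case it helps with triage, here is a small shell sketch for tallying which remote endpoint the resets come from. It assumes the server log has been saved to a file first; the function name reset_peers and the file name in the usage note are made up for illustration.

```shell
# Count "connection reset by peer" errors per remote endpoint.
# $1 is the path to a saved copy of the server log (hypothetical file).
reset_peers() {
  grep -F 'connection reset by peer' "$1" \
    | sed -n 's/.*->\([0-9.]*:[0-9]*\): read: connection reset by peer.*/\1/p' \
    | sort | uniq -c
}
```

For example, after something like `kubectl logs consul-server-0 > k8s-server.log` (the pod name is an assumption based on the WAN member names in these logs), `reset_peers k8s-server.log` prints one count per endpoint. In the excerpt above, every reset is on a connection to 172.17.0.5:8443.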
And some from the local server:
BootstrapExpect is set to 1; this is the same as Bootstrap mode.
bootstrap = true: do not enable unless necessary
==> Starting Consul agent...
           Version: 'v1.8.0'
           Node ID: 'ac4d0768-3b68-a8d5-0fe3-bb05c0822df1'
         Node name: 'zane-pc'
        Datacenter: 'vm-secondary' (Segment: '<all>')
            Server: true (Bootstrap: true)
       Client Addr: [127.0.0.1] (HTTP: -1, HTTPS: 8501, gRPC: 8502, DNS: 8600)
      Cluster Addr: 192.168.99.1 (LAN: 8301, WAN: 8302)
           Encrypt: Gossip: false, TLS-Outgoing: true, TLS-Incoming: false, Auto-Encrypt-TLS: false
==> Log data will now stream in as it occurs:
2020-07-23T11:26:27.626+0200 [INFO] agent.server.gateway_locator: will dial the primary datacenter through its mesh gateways
2020-07-23T11:26:27.827+0200 [INFO] agent.server.raft: initial configuration: index=1 servers="[{Suffrage:Voter ID:ac4d0768-3b68-a8d5-0fe3-bb05c0822df1 Address:192.168.99.1:8300}]"
2020-07-23T11:26:27.827+0200 [INFO] agent.server.raft: entering follower state: follower="Node at 192.168.99.1:8300 [Follower]" leader=
2020-07-23T11:26:27.828+0200 [INFO] agent.server.serf.wan: serf: EventMemberJoin: zane-pc.vm-secondary 192.168.99.1
2020-07-23T11:26:27.828+0200 [INFO] agent.server.serf.lan: serf: EventMemberJoin: zane-pc 192.168.99.1
2020-07-23T11:26:27.829+0200 [INFO] agent.server: Adding LAN server: server="zane-pc (Addr: tcp/192.168.99.1:8300) (DC: vm-secondary)"
2020-07-23T11:26:27.829+0200 [INFO] agent: Started DNS server: address=127.0.0.1:8600 network=udp
2020-07-23T11:26:27.829+0200 [INFO] agent.server: Handled event for server in area: event=member-join server=zane-pc.vm-secondary area=wan
2020-07-23T11:26:27.829+0200 [INFO] agent: Started DNS server: address=127.0.0.1:8600 network=tcp
2020-07-23T11:26:27.830+0200 [INFO] agent: Started HTTPS server: address=127.0.0.1:8501 network=tcp
2020-07-23T11:26:27.830+0200 [INFO] agent: Started gRPC server: address=127.0.0.1:8502 network=tcp
2020-07-23T11:26:27.830+0200 [INFO] agent: started state syncer
==> Consul agent running!
2020-07-23T11:26:27.830+0200 [INFO] agent: Refreshing mesh gateways is supported for the following discovery methods: discovery_methods="aliyun aws azure digitalocean gce k8s linode mdns os packet scaleway softlayer tencentcloud triton vsphere"
2020-07-23T11:26:27.830+0200 [INFO] agent: Refreshing mesh gateways...
2020-07-23T11:26:27.830+0200 [INFO] agent.server.gateway_locator: updated fallback list of primary mesh gateways: mesh_gateways=[192.168.99.104:30085]
2020-07-23T11:26:27.830+0200 [INFO] agent: Refreshing mesh gateways completed
2020-07-23T11:26:27.830+0200 [INFO] agent: Retry join is supported for the following discovery methods: cluster=WAN discovery_methods="aliyun aws azure digitalocean gce k8s linode mdns os packet scaleway softlayer tencentcloud triton vsphere"
2020-07-23T11:26:27.830+0200 [INFO] agent: Joining cluster...: cluster=WAN
2020-07-23T11:26:27.830+0200 [INFO] agent: (WAN) joining: wan_addresses=[*.k8s-primary/192.0.2.2]
2020-07-23T11:26:27.881+0200 [INFO] agent.server.serf.wan: serf: EventMemberJoin: consul-server-1.k8s-primary 172.17.0.8
2020-07-23T11:26:27.881+0200 [INFO] agent.server.serf.wan: serf: EventMemberJoin: consul-server-2.k8s-primary 172.17.0.9
2020-07-23T11:26:27.881+0200 [INFO] agent.server.serf.wan: serf: EventMemberJoin: consul-server-0.k8s-primary 172.17.0.7
2020-07-23T11:26:27.881+0200 [INFO] agent.server: Handled event for server in area: event=member-join server=consul-server-1.k8s-primary area=wan
2020-07-23T11:26:27.881+0200 [INFO] agent: (WAN) joined: number_of_nodes=1
2020-07-23T11:26:27.881+0200 [INFO] agent: Join cluster completed. Synced with initial agents: cluster=WAN num_agents=1
2020-07-23T11:26:27.881+0200 [INFO] agent.server: Handled event for server in area: event=member-join server=consul-server-2.k8s-primary area=wan
2020-07-23T11:26:27.881+0200 [INFO] agent.server: Handled event for server in area: event=member-join server=consul-server-0.k8s-primary area=wan
2020-07-23T11:26:34.764+0200 [WARN] agent.server.raft: heartbeat timeout reached, starting election: last-leader=
2020-07-23T11:26:34.764+0200 [INFO] agent.server.raft: entering candidate state: node="Node at 192.168.99.1:8300 [Candidate]" term=2
2020-07-23T11:26:34.856+0200 [INFO] agent.server.raft: election won: tally=1
2020-07-23T11:26:34.857+0200 [INFO] agent.server.raft: entering leader state: leader="Node at 192.168.99.1:8300 [Leader]"
2020-07-23T11:26:34.857+0200 [INFO] agent.server: cluster leadership acquired
2020-07-23T11:26:34.857+0200 [INFO] agent.server: New leader elected: payload=zane-pc
2020-07-23T11:26:34.997+0200 [INFO] agent: Synced node info
2020-07-23T11:26:35.125+0200 [INFO] agent.server.connect: received new intermediate certificate from primary datacenter
2020-07-23T11:26:35.163+0200 [INFO] agent.server.connect: updated root certificates from primary datacenter
2020-07-23T11:26:35.163+0200 [INFO] agent.server.connect: initialized secondary datacenter CA with provider: provider=consul
2020-07-23T11:26:35.163+0200 [INFO] agent.leader: started routine: routine="config entry replication"
2020-07-23T11:26:35.163+0200 [INFO] agent.leader: started routine: routine="federation state replication"
2020-07-23T11:26:35.163+0200 [INFO] agent.leader: started routine: routine="federation state anti-entropy"
2020-07-23T11:26:35.163+0200 [INFO] agent.leader: started routine: routine="secondary CA roots watch"
2020-07-23T11:26:35.163+0200 [INFO] agent.leader: started routine: routine="intention replication"
2020-07-23T11:26:35.163+0200 [INFO] agent.leader: started routine: routine="secondary cert renew watch"
2020-07-23T11:26:35.163+0200 [INFO] agent.leader: started routine: routine="CA root pruning"
2020-07-23T11:26:35.163+0200 [INFO] agent.server: member joined, marking health alive: member=zane-pc
2020-07-23T11:26:35.164+0200 [INFO] agent.server.gateway_locator: will dial the primary datacenter using our local mesh gateways if possible
2020-07-23T11:26:35.168+0200 [INFO] agent.server: federation state anti-entropy synced
2020-07-23T11:26:35.244+0200 [INFO] agent.server: federation state anti-entropy synced
2020-07-23T11:26:37.828+0200 [INFO] agent.server.memberlist.wan: memberlist: Suspect consul-server-0.k8s-primary has failed, no acks received
2020-07-23T11:26:41.828+0200 [INFO] agent.server.gateway_locator: new cached locations of mesh gateways: primary=[192.168.99.104:30085] local=[]
2020-07-23T11:26:52.828+0200 [INFO] agent.server.memberlist.wan: memberlist: Suspect consul-server-1.k8s-primary has failed, no acks received
2020-07-23T11:27:07.828+0200 [INFO] agent.server.memberlist.wan: memberlist: Marking consul-server-0.k8s-primary as failed, suspect timeout reached (0 peer confirmations)
2020-07-23T11:27:07.828+0200 [INFO] agent.server.serf.wan: serf: EventMemberFailed: consul-server-0.k8s-primary 172.17.0.7
2020-07-23T11:27:07.828+0200 [INFO] agent.server: Handled event for server in area: event=member-failed server=consul-server-0.k8s-primary area=wan
2020-07-23T11:27:12.828+0200 [INFO] agent.server.memberlist.wan: memberlist: Suspect consul-server-2.k8s-primary has failed, no acks received
2020-07-23T11:27:22.828+0200 [INFO] agent.server.memberlist.wan: memberlist: Marking consul-server-1.k8s-primary as failed, suspect timeout reached (0 peer confirmations)
2020-07-23T11:27:22.828+0200 [INFO] agent.server.serf.wan: serf: EventMemberFailed: consul-server-1.k8s-primary 172.17.0.8
2020-07-23T11:27:22.829+0200 [INFO] agent.server: Handled event for server in area: event=member-failed server=consul-server-1.k8s-primary area=wan
2020-07-23T11:27:27.828+0200 [INFO] agent.server.serf.wan: serf: attempting reconnect to consul-server-0.k8s-primary 172.17.0.7:8302
2020-07-23T11:27:27.832+0200 [INFO] agent.server.serf.wan: serf: EventMemberJoin: consul-server-0.k8s-primary 172.17.0.7
2020-07-23T11:27:27.832+0200 [INFO] agent.server.serf.wan: serf: EventMemberJoin: consul-server-1.k8s-primary 172.17.0.8
2020-07-23T11:27:27.832+0200 [WARN] agent.server.memberlist.wan: memberlist: Refuting a suspect message (from: zane-pc.vm-secondary)
2020-07-23T11:27:27.832+0200 [INFO] agent.server: Handled event for server in area: event=member-join server=consul-server-0.k8s-primary area=wan
2020-07-23T11:27:27.832+0200 [INFO] agent.server: Handled event for server in area: event=member-join server=consul-server-1.k8s-primary area=wan
2020-07-23T11:27:37.829+0200 [INFO] agent.server.memberlist.wan: memberlist: Suspect consul-server-1.k8s-primary has failed, no acks received
2020-07-23T11:27:48.657+0200 [WARN] agent.server.memberlist.wan: memberlist: Refuting a suspect message (from: zane-pc.vm-secondary)
2020-07-23T11:28:12.829+0200 [INFO] agent.server.memberlist.wan: memberlist: Suspect consul-server-2.k8s-primary has failed, no acks received
2020-07-23T11:28:42.829+0200 [INFO] agent.server.memberlist.wan: memberlist: Marking consul-server-2.k8s-primary as failed, suspect timeout reached (0 peer confirmations)
2020-07-23T11:28:42.829+0200 [INFO] agent.server.serf.wan: serf: EventMemberFailed: consul-server-2.k8s-primary 172.17.0.9
2020-07-23T11:28:42.829+0200 [INFO] agent.server: Handled event for server in area: event=member-failed server=consul-server-2.k8s-primary area=wan
2020-07-23T11:28:48.661+0200 [INFO] agent.server.serf.wan: serf: EventMemberJoin: consul-server-2.k8s-primary 172.17.0.9
2020-07-23T11:28:48.661+0200 [WARN] agent.server.memberlist.wan: memberlist: Refuting a suspect message (from: zane-pc.vm-secondary)
2020-07-23T11:28:48.661+0200 [INFO] agent.server: Handled event for server in area: event=member-join server=consul-server-2.k8s-primary area=wan
2020-07-23T11:28:52.829+0200 [INFO] agent.server.memberlist.wan: memberlist: Suspect consul-server-0.k8s-primary has failed, no acks received
2020-07-23T11:29:22.829+0200 [INFO] agent.server.memberlist.wan: memberlist: Marking consul-server-0.k8s-primary as failed, suspect timeout reached (0 peer confirmations)
2020-07-23T11:29:22.829+0200 [INFO] agent.server.serf.wan: serf: EventMemberFailed: consul-server-0.k8s-primary 172.17.0.7
2020-07-23T11:29:22.829+0200 [INFO] agent.server: Handled event for server in area: event=member-failed server=consul-server-0.k8s-primary area=wan
2020-07-23T11:29:27.833+0200 [INFO] agent.server.serf.wan: serf: attempting reconnect to consul-server-0.k8s-primary 172.17.0.7:8302
2020-07-23T11:29:27.837+0200 [WARN] agent.server.memberlist.wan: memberlist: Refuting a suspect message (from: zane-pc.vm-secondary)
2020-07-23T11:29:27.837+0200 [INFO] agent.server.serf.wan: serf: EventMemberJoin: consul-server-0.k8s-primary 172.17.0.7
2020-07-23T11:29:27.837+0200 [INFO] agent.server: Handled event for server in area: event=member-join server=consul-server-0.k8s-primary area=wan
2020-07-23T11:29:32.829+0200 [INFO] agent.server.memberlist.wan: memberlist: Suspect consul-server-2.k8s-primary has failed, no acks received
2020-07-23T11:29:48.680+0200 [WARN] agent.server.memberlist.wan: memberlist: Refuting a suspect message (from: zane-pc.vm-secondary)
2020-07-23T11:30:12.829+0200 [INFO] agent.server.memberlist.wan: memberlist: Suspect consul-server-1.k8s-primary has failed, no acks received
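The gateway_locator lines are easy to miss in the output above, so here is a hedged sketch for pulling them out of a saved copy of the local agent's log. The function name gateway_lines and the log file name in the usage note are hypothetical.

```shell
# Print only the gateway_locator lines from a saved agent log.
# $1 is the path to the saved log (hypothetical file).
gateway_lines() {
  grep -F 'agent.server.gateway_locator' "$1"
}
```

Usage would be something like `gateway_lines consul-local.log`. In the output above, the last such line reports primary=[192.168.99.104:30085] local=[], so the local server has no mesh gateway cached for vm-secondary itself, which may or may not be related to the failures.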
Any help or direction here would be really appreciated.