I am trying to set up a simple Consul cluster running in a Docker swarm of six nodes connected by a private network. I want Consul to be available locally on each of the machines, with three server instances in total. Since the server containers need to address each other individually, I need to use endpoint mode dnsrr on the servers. To give access to the agent running locally, the agents also require endpoint mode dnsrr.
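To illustrate what dnsrr changes: with this endpoint mode, the service name resolves directly to the addresses of the individual tasks rather than to a single virtual IP. A minimal Python sketch of such a lookup follows. It is meant to be run from a container attached to the same overlay network, and the helper name resolve_all is made up for illustration.

```python
import socket


def resolve_all(name, port=8300):
    """Return every distinct IPv4 address that `name` resolves to.

    `resolve_all` is a hypothetical helper, not part of Consul or Docker.
    """
    infos = socket.getaddrinfo(
        name, port, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry is (family, type, proto, canonname, sockaddr),
    # and for IPv4 the sockaddr is an (ip, port) tuple.
    return sorted({info[4][0] for info in infos})
```

Calling resolve_all("consul_server") inside such a container would be expected to list one address per running server task, assuming the stack is deployed under the name consul.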
I found no clever way of setting up the placement dynamically so that each machine is either a server or an agent. So I selected deploy mode global for the agent, and a regular deployment with three replicas and max_replicas_per_node set to 1 for the servers. This results in three nodes running both a server and an agent. For the TLS keys, I created two secrets, consul_ca and consul_ca_key.
This setup works for the bootstrapping process: the servers find each other and elect a leader, so there is a working network connection between the nodes. When synchronizing the server state, however, raft runs into what looks like a network issue. A typical error from the server logs looks like this:
[ERROR] agent.server.raft: failed to appendEntries to: peer="{Voter 7d028a82-7559-0799-bd15-0b2981d4d592 10.0.1.15:8300}" error="read tcp 10.0.1.16:37125->10.0.1.15:8300: i/o timeout"
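To check whether these timeouts hit a single peer or all of them, the server log can be tallied per peer address. The following is only a sketch: the regular expression is derived from the error line above and assumes that exact message format, and count_raft_timeouts is a name chosen for illustration.

```python
import re
from collections import Counter

# Pattern assumed from the raft error line shown above (Consul 1.13 output);
# it captures the peer address of each failed appendEntries call.
_RAFT_TIMEOUT = re.compile(
    r'failed to appendEntries to: peer="\{Voter \S+ (?P<peer>[\d.]+:\d+)\}" '
    r'error="read tcp [\d.]+:\d+->[\d.]+:\d+: i/o timeout"'
)


def count_raft_timeouts(log_text):
    """Count appendEntries read timeouts per peer address."""
    return Counter(m.group("peer") for m in _RAFT_TIMEOUT.finditer(log_text))
```

The input text can be the captured output of sudo docker service logs consul_server. A roughly even count across peers would suggest a problem with the network as a whole rather than with one node.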
Any hints on what the issue might be, or on how I can debug it, are appreciated. So far I have focused on getting the servers up and running, so there may be errors in the agent configuration as well. I included the agents here to show the intended overall network/port setup for the cluster. Feedback on improving the overall setup would also be very helpful.
Here are the full configuration and logs. The docker-compose.yml file, which is deployed to the swarm using sudo docker stack deploy --compose-file ./docker-compose.yml consul, is:
version: '3.9'
services:
  server:
    image: consul:1.13
    command: /bin/sh -c "consul tls cert create -server -ca=/run/secrets/consul_ca -key=/run/secrets/consul_ca_key; consul agent -bootstrap-expect 3 -config-file=/etc/consul.d/server.hcl"
    deploy:
      replicas: 3
      placement:
        max_replicas_per_node: 1
      endpoint_mode: dnsrr
    ports:
      - target: 8300
        published: 8300
        protocol: tcp
        mode: host
      - target: 8301
        published: 8301
        protocol: tcp
        mode: host
      - target: 8301
        published: 8301
        protocol: udp
        mode: host
    configs:
      - source: server
        target: /etc/consul.d/server.hcl
    secrets:
      - consul_ca
      - consul_ca_key
    networks:
      - consul
    volumes:
      - data:/consul/data
  agent:
    image: consul:1.13
    command: /bin/sh -c "consul tls cert create -client -ca=/run/secrets/consul_ca -key=/run/secrets/consul_ca_key; consul agent -config-file=/etc/consul.d/agent.hcl"
    depends_on:
      - server
    deploy:
      mode: global
      endpoint_mode: dnsrr
    ports:
      - target: 8600
        published: 8600
        protocol: tcp
        mode: host
      - target: 8600
        published: 8600
        protocol: udp
        mode: host
    configs:
      - source: agent
        target: /etc/consul.d/agent.hcl
    secrets:
      - consul_ca
      - consul_ca_key
    networks:
      - consul
configs:
  server:
    file: ./server.hcl
  agent:
    file: ./agent.hcl
secrets:
  consul_ca:
    external: true
  consul_ca_key:
    external: true
networks:
  consul:
    attachable: true
volumes:
  data:
The secrets are generated using the following commands (please replace user and example.com with the respective values):
sudo docker run --rm -v /home/user:/ca consul sh -c "cd /ca; consul tls ca create -domain example.com"
cat /home/user/example.com-agent-ca-key.pem | sudo docker secret create consul_ca_key -
cat /home/user/example.com-agent-ca.pem | sudo docker secret create consul_ca -
sudo docker run --rm consul consul keygen
The last command outputs a key for the gossip encryption.
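As a sanity check before pasting the key into the configs, its shape can be verified. The sketch below assumes that the key is standard base64 text and that Consul accepts keys of 16, 24 or 32 decoded bytes (the AES key sizes), with consul keygen producing the 32-byte variant. The function name is_valid_gossip_key is made up for illustration.

```python
import base64
import binascii


def is_valid_gossip_key(key):
    """Check that `key` is base64 text decoding to 16, 24 or 32 bytes.

    The accepted sizes are an assumption based on the AES key lengths.
    """
    try:
        raw = base64.b64decode(key, validate=True)
    except (binascii.Error, ValueError):
        # Not valid base64 (wrong alphabet or padding).
        return False
    return len(raw) in (16, 24, 32)
```

The unreplaced placeholder GOSSIP fails this check, which makes it easy to catch a config that was deployed without substituting the real key.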
The server config, server.hcl, is (replace GOSSIP with the gossip key generated in the last step):
server = true
datacenter = "dc1"
bind_addr = "{{ GetInterfaceIP \"eth0\" }}"
data_dir = "/consul/data"
encrypt = "GOSSIP"
ports {
  dns = -1
  http = -1
  https = -1
  grpc = -1
  server = 8300
  serf_lan = 8301
  serf_wan = -1
}
tls {
  defaults {
    ca_file = "/run/secrets/consul_ca"
    cert_file = "/dc1-server-consul-0.pem"
    key_file = "/dc1-server-consul-0-key.pem"
    verify_incoming = true
    verify_outgoing = true
  }
  internal_rpc {
    verify_server_hostname = false
  }
}
auto_encrypt {
  allow_tls = true
}
retry_join = ["consul_server"]
The agent config, agent.hcl, is (replace GOSSIP with the gossip key generated above):
server = false
datacenter = "dc1"
bind_addr = "{{ GetInterfaceIP \"eth0\" }}"
data_dir = "/consul/data"
encrypt = "GOSSIP"
ports {
  dns = 8600
  http = -1
  https = -1
  grpc = -1
  server = 8300
  serf_lan = 8301
  serf_wan = -1
}
tls {
  defaults {
    ca_file = "/run/secrets/consul_ca"
    cert_file = "/dc1-client-consul-0.pem"
    key_file = "/dc1-client-consul-0-key.pem"
    verify_incoming = true
    verify_outgoing = true
  }
  internal_rpc {
    verify_server_hostname = false
  }
}
client_addr = "0.0.0.0"
retry_join = ["consul_server"]
Running sudo docker service logs consul_server gives the following output. First, the cluster leader:
==> WARNING: Server Certificates grants authority to become a
server and access all state in the cluster including root keys
and all ACL tokens. Do not distribute them to production hosts
that are not server nodes. Store them as securely as CA keys.
==> Using /run/secrets/consul_ca and /run/secrets/consul_ca_key
==> Saved dc1-server-consul-0.pem
==> Saved dc1-server-consul-0-key.pem
==> Starting Consul agent...
Version: '1.13.1'
Build Date: '2022-08-11 19:07:00 +0000 UTC'
Node ID: 'f145d46b-c4d6-7107-ecea-587e304abd57'
Node name: '728c5555b1d0'
Datacenter: 'dc1' (Segment: '<all>')
Server: true (Bootstrap: false)
Client Addr: [127.0.0.1] (HTTP: -1, HTTPS: -1, gRPC: -1, DNS: -1)
Cluster Addr: 10.0.1.16 (LAN: 8301, WAN: -1)
Encrypt: Gossip: true, TLS-Outgoing: true, TLS-Incoming: true, Auto-Encrypt-TLS: true
consul_server.2.0jddgj4wvt6h@cloudsrv-4 |
==> Log data will now stream in as it occurs:
consul_server.2.0jddgj4wvt6h@cloudsrv-4 |
2022-09-17T20:00:18.633Z [WARN] agent: bootstrap_expect > 0: expecting 3 servers
2022-09-17T20:00:18.660Z [WARN] agent.auto_config: bootstrap_expect > 0: expecting 3 servers
2022-09-17T20:00:18.688Z [INFO] agent.server.raft: initial configuration: index=0 servers=[]
2022-09-17T20:00:18.690Z [INFO] agent.server.serf.lan: serf: EventMemberJoin: 728c5555b1d0 10.0.1.16
2022-09-17T20:00:18.690Z [INFO] agent.router: Initializing LAN area manager
2022-09-17T20:00:18.691Z [INFO] agent.server.autopilot: reconciliation now disabled
2022-09-17T20:00:18.696Z [INFO] agent.server.raft: entering follower state: follower="Node at 10.0.1.16:8300 [Follower]" leader-address= leader-id=
2022-09-17T20:00:18.699Z [INFO] agent.server: Adding LAN server: server="728c5555b1d0 (Addr: tcp/10.0.1.16:8300) (DC: dc1)"
2022-09-17T20:00:18.712Z [INFO] agent: started state syncer
2022-09-17T20:00:18.712Z [INFO] agent: Consul agent running!
2022-09-17T20:00:18.714Z [INFO] agent: Retry join is supported for the following discovery methods: cluster=LAN discovery_methods="aliyun aws azure digitalocean gce k8s linode mdns os packet scaleway softlayer tencentcloud triton vsphere"
2022-09-17T20:00:18.714Z [INFO] agent: Joining cluster...: cluster=LAN
2022-09-17T20:00:18.714Z [INFO] agent: (LAN) joining: lan_addresses=[consul_server]
2022-09-17T20:00:18.723Z [INFO] agent: (LAN) joined: number_of_nodes=1
2022-09-17T20:00:18.723Z [INFO] agent: Join cluster completed. Synced with initial agents: cluster=LAN num_agents=1
2022-09-17T20:00:18.812Z [INFO] agent.server.serf.lan: serf: EventMemberJoin: 82e5134c3a81 10.0.1.15
2022-09-17T20:00:18.812Z [INFO] agent.server: Adding LAN server: server="82e5134c3a81 (Addr: tcp/10.0.1.15:8300) (DC: dc1)"
2022-09-17T20:00:25.913Z [ERROR] agent.anti_entropy: failed to sync remote state: error="No cluster leader"
2022-09-17T20:00:26.411Z [WARN] agent.server.raft: no known peers, aborting election
2022-09-17T20:00:30.307Z [INFO] agent.server.serf.lan: serf: EventMemberJoin: 8523d9956b78 10.0.1.4
2022-09-17T20:00:30.366Z [INFO] agent.server.serf.lan: serf: EventMemberJoin: 465d55d8d2e8 10.0.1.14
2022-09-17T20:00:30.366Z [INFO] agent.server: Adding LAN server: server="465d55d8d2e8 (Addr: tcp/10.0.1.14:8300) (DC: dc1)"
2022-09-17T20:00:30.374Z [INFO] agent.server: Found expected number of peers, attempting bootstrap: peers=10.0.1.14:8300,10.0.1.16:8300,10.0.1.15:8300
2022-09-17T20:00:31.556Z [WARN] agent.server.raft: heartbeat timeout reached, starting election: last-leader-addr= last-leader-id=
2022-09-17T20:00:31.557Z [INFO] agent.server.raft: entering candidate state: node="Node at 10.0.1.16:8300 [Candidate]" term=2
2022-09-17T20:00:31.565Z [INFO] agent.server.raft: election won: tally=2
2022-09-17T20:00:31.565Z [INFO] agent.server.raft: entering leader state: leader="Node at 10.0.1.16:8300 [Leader]"
2022-09-17T20:00:31.566Z [INFO] agent.server.raft: added peer, starting replication: peer=554b6aea-ce76-0267-9ff8-6effd36ace4b
2022-09-17T20:00:31.566Z [INFO] agent.server.raft: added peer, starting replication: peer=7d028a82-7559-0799-bd15-0b2981d4d592
2022-09-17T20:00:31.567Z [INFO] agent.server: cluster leadership acquired
2022-09-17T20:00:31.567Z [INFO] agent.server: New leader elected: payload=728c5555b1d0
2022-09-17T20:00:31.569Z [WARN] agent.server.raft: appendEntries rejected, sending older logs: peer="{Voter 554b6aea-ce76-0267-9ff8-6effd36ace4b 10.0.1.14:8300}" next=1
2022-09-17T20:00:31.572Z [INFO] agent.server.raft: pipelining replication: peer="{Voter 554b6aea-ce76-0267-9ff8-6effd36ace4b 10.0.1.14:8300}"
2022-09-17T20:00:31.573Z [INFO] agent.server.raft: pipelining replication: peer="{Voter 7d028a82-7559-0799-bd15-0b2981d4d592 10.0.1.15:8300}"
2022-09-17T20:00:31.583Z [INFO] agent.server.autopilot: reconciliation now enabled
2022-09-17T20:00:31.585Z [INFO] agent.leader: started routine: routine="federation state anti-entropy"
2022-09-17T20:00:31.585Z [INFO] agent.leader: started routine: routine="federation state pruning"
2022-09-17T20:00:38.635Z [ERROR] agent.server: error performing anti-entropy sync of federation state: error="Not ready to serve consistent reads"
2022-09-17T20:00:40.962Z [INFO] agent.server.serf.lan: serf: EventMemberJoin: 84b80437a8af 10.0.1.6
2022-09-17T20:00:41.601Z [ERROR] agent.server.raft: failed to pipeline appendEntries: peer="{Voter 7d028a82-7559-0799-bd15-0b2981d4d592 10.0.1.15:8300}" error="use of closed network connection"
2022-09-17T20:00:41.601Z [INFO] agent.server.raft: aborting pipeline replication: peer="{Voter 7d028a82-7559-0799-bd15-0b2981d4d592 10.0.1.15:8300}"
2022-09-17T20:00:41.609Z [ERROR] agent.server.raft: failed to pipeline appendEntries: peer="{Voter 554b6aea-ce76-0267-9ff8-6effd36ace4b 10.0.1.14:8300}" error="use of closed network connection"
2022-09-17T20:00:41.609Z [INFO] agent.server.raft: aborting pipeline replication: peer="{Voter 554b6aea-ce76-0267-9ff8-6effd36ace4b 10.0.1.14:8300}"
2022-09-17T20:00:43.814Z [INFO] agent.server.serf.lan: serf: EventMemberJoin: ddd0f09a4f97 10.0.1.5
2022-09-17T20:00:43.852Z [INFO] agent.server.serf.lan: serf: EventMemberJoin: ea8ef0287938 10.0.1.7
2022-09-17T20:00:44.389Z [INFO] agent.server.serf.lan: serf: EventMemberJoin: df3ebf259660 10.0.1.2
2022-09-17T20:00:44.540Z [INFO] agent.server.serf.lan: serf: EventMemberJoin: 577e36c1936c 10.0.1.3
2022-09-17T20:00:47.841Z [ERROR] agent.server: error performing anti-entropy sync of federation state: error="Not ready to serve consistent reads"
2022-09-17T20:00:51.661Z [ERROR] agent.server.raft: failed to appendEntries to: peer="{Voter 554b6aea-ce76-0267-9ff8-6effd36ace4b 10.0.1.14:8300}" error="read tcp 10.0.1.16:36469->10.0.1.14:8300: i/o timeout"
2022-09-17T20:00:51.700Z [ERROR] agent.server.raft: failed to appendEntries to: peer="{Voter 7d028a82-7559-0799-bd15-0b2981d4d592 10.0.1.15:8300}" error="read tcp 10.0.1.16:37125->10.0.1.15:8300: i/o timeout"
2022-09-17T20:00:54.311Z [ERROR] agent.server.memberlist.lan: memberlist: failed to receive and remove the stream label header: read tcp 10.0.1.16:8301->10.0.1.2:60708: i/o timeout from=<unknown address>
2022-09-17T20:00:58.912Z [ERROR] agent.server: error performing anti-entropy sync of federation state: error="Not ready to serve consistent reads"
2022-09-17T20:01:01.673Z [ERROR] agent.server.raft: failed to appendEntries to: peer="{Voter 554b6aea-ce76-0267-9ff8-6effd36ace4b 10.0.1.14:8300}" error="read tcp 10.0.1.16:50263->10.0.1.14:8300: i/o timeout"
2022-09-17T20:01:01.711Z [ERROR] agent.server.raft: failed to appendEntries to: peer="{Voter 7d028a82-7559-0799-bd15-0b2981d4d592 10.0.1.15:8300}" error="read tcp 10.0.1.16:36939->10.0.1.15:8300: i/o timeout"
2022-09-17T20:01:11.685Z [ERROR] agent.server.raft: failed to appendEntries to: peer="{Voter 554b6aea-ce76-0267-9ff8-6effd36ace4b 10.0.1.14:8300}" error="read tcp 10.0.1.16:37499->10.0.1.14:8300: i/o timeout"
2022-09-17T20:01:11.722Z [ERROR] agent.server.raft: failed to appendEntries to: peer="{Voter 7d028a82-7559-0799-bd15-0b2981d4d592 10.0.1.15:8300}" error="read tcp 10.0.1.16:41267->10.0.1.15:8300: i/o timeout"
2022-09-17T20:01:13.939Z [ERROR] agent.server: error performing anti-entropy sync of federation state: error="Not ready to serve consistent reads"
2022-09-17T20:01:21.707Z [ERROR] agent.server.raft: failed to appendEntries to: peer="{Voter 554b6aea-ce76-0267-9ff8-6effd36ace4b 10.0.1.14:8300}" error="read tcp 10.0.1.16:36831->10.0.1.14:8300: i/o timeout"
2022-09-17T20:01:21.744Z [ERROR] agent.server.raft: failed to appendEntries to: peer="{Voter 7d028a82-7559-0799-bd15-0b2981d4d592 10.0.1.15:8300}" error="read tcp 10.0.1.16:33295->10.0.1.15:8300: i/o timeout"
2022-09-17T20:01:24.868Z [ERROR] agent.server.memberlist.lan: memberlist: Push/Pull with 82e5134c3a81 failed: read tcp 10.0.1.16:58220->10.0.1.15:8301: i/o timeout
2022-09-17T20:01:31.749Z [ERROR] agent.server.raft: failed to appendEntries to: peer="{Voter 554b6aea-ce76-0267-9ff8-6effd36ace4b 10.0.1.14:8300}" error="read tcp 10.0.1.16:34645->10.0.1.14:8300: i/o timeout"
2022-09-17T20:01:31.785Z [ERROR] agent.server.raft: failed to appendEntries to: peer="{Voter 7d028a82-7559-0799-bd15-0b2981d4d592 10.0.1.15:8300}" error="read tcp 10.0.1.16:55819->10.0.1.15:8300: i/o timeout"
2022-09-17T20:01:37.065Z [ERROR] agent.server: error performing anti-entropy sync of federation state: error="Not ready to serve consistent reads"
2022-09-17T20:01:41.832Z [ERROR] agent.server.raft: failed to appendEntries to: peer="{Voter 554b6aea-ce76-0267-9ff8-6effd36ace4b 10.0.1.14:8300}" error="read tcp 10.0.1.16:32881->10.0.1.14:8300: i/o timeout"
2022-09-17T20:01:41.870Z [ERROR] agent.server.raft: failed to appendEntries to: peer="{Voter 7d028a82-7559-0799-bd15-0b2981d4d592 10.0.1.15:8300}" error="read tcp 10.0.1.16:33019->10.0.1.15:8300: i/o timeout"
2022-09-17T20:01:51.993Z [ERROR] agent.server.raft: failed to appendEntries to: peer="{Voter 554b6aea-ce76-0267-9ff8-6effd36ace4b 10.0.1.14:8300}" error="read tcp 10.0.1.16:50971->10.0.1.14:8300: i/o timeout"
2022-09-17T20:01:52.031Z [ERROR] agent.server.raft: failed to appendEntries to: peer="{Voter 7d028a82-7559-0799-bd15-0b2981d4d592 10.0.1.15:8300}" error="read tcp 10.0.1.16:55439->10.0.1.15:8300: i/o timeout"
2022-09-17T20:02:02.315Z [ERROR] agent.server.raft: failed to appendEntries to: peer="{Voter 554b6aea-ce76-0267-9ff8-6effd36ace4b 10.0.1.14:8300}" error="read tcp 10.0.1.16:52609->10.0.1.14:8300: i/o timeout"
2022-09-17T20:02:02.354Z [ERROR] agent.server.raft: failed to appendEntries to: peer="{Voter 7d028a82-7559-0799-bd15-0b2981d4d592 10.0.1.15:8300}" error="read tcp 10.0.1.16:57331->10.0.1.15:8300: i/o timeout"
2022-09-17T20:02:04.872Z [ERROR] agent.server.memberlist.lan: memberlist: Push/Pull with ddd0f09a4f97 failed: read tcp 10.0.1.16:54944->10.0.1.5:8301: i/o timeout
2022-09-17T20:02:12.957Z [ERROR] agent.server.raft: failed to appendEntries to: peer="{Voter 554b6aea-ce76-0267-9ff8-6effd36ace4b 10.0.1.14:8300}" error="read tcp 10.0.1.16:35751->10.0.1.14:8300: i/o timeout"
2022-09-17T20:02:12.996Z [ERROR] agent.server.raft: failed to appendEntries to: peer="{Voter 7d028a82-7559-0799-bd15-0b2981d4d592 10.0.1.15:8300}" error="read tcp 10.0.1.16:37279->10.0.1.15:8300: i/o timeout"
2022-09-17T20:02:16.189Z [ERROR] agent.server: error performing anti-entropy sync of federation state: error="Not ready to serve consistent reads"
2022-09-17T20:02:24.298Z [ERROR] agent.server.raft: failed to appendEntries to: peer="{Voter 554b6aea-ce76-0267-9ff8-6effd36ace4b 10.0.1.14:8300}" error="read tcp 10.0.1.16:56913->10.0.1.14:8300: i/o timeout"
2022-09-17T20:02:24.354Z [ERROR] agent.server.raft: failed to appendEntries to: peer="{Voter 7d028a82-7559-0799-bd15-0b2981d4d592 10.0.1.15:8300}" error="read tcp 10.0.1.16:54365->10.0.1.15:8300: i/o timeout"
Log for one of the two followers (the other one is very similar):
==> WARNING: Server Certificates grants authority to become a
server and access all state in the cluster including root keys
and all ACL tokens. Do not distribute them to production hosts
that are not server nodes. Store them as securely as CA keys.
==> Using /run/secrets/consul_ca and /run/secrets/consul_ca_key
==> Saved dc1-server-consul-0.pem
==> Saved dc1-server-consul-0-key.pem
==> Starting Consul agent...
Version: '1.13.1'
Build Date: '2022-08-11 19:07:00 +0000 UTC'
Node ID: '7d028a82-7559-0799-bd15-0b2981d4d592'
Node name: '82e5134c3a81'
Datacenter: 'dc1' (Segment: '<all>')
Server: true (Bootstrap: false)
Client Addr: [127.0.0.1] (HTTP: -1, HTTPS: -1, gRPC: -1, DNS: -1)
Cluster Addr: 10.0.1.15 (LAN: 8301, WAN: -1)
Encrypt: Gossip: true, TLS-Outgoing: true, TLS-Incoming: true, Auto-Encrypt-TLS: true
consul_server.1.ncdplcdccve8@cloudsrv-6 |
==> Log data will now stream in as it occurs:
consul_server.1.ncdplcdccve8@cloudsrv-6 |
2022-09-17T20:00:18.757Z [WARN] agent: bootstrap_expect > 0: expecting 3 servers
2022-09-17T20:00:18.773Z [WARN] agent.auto_config: bootstrap_expect > 0: expecting 3 servers
2022-09-17T20:00:18.786Z [INFO] agent.server.raft: initial configuration: index=0 servers=[]
2022-09-17T20:00:18.787Z [INFO] agent.server.serf.lan: serf: EventMemberJoin: 82e5134c3a81 10.0.1.15
2022-09-17T20:00:18.787Z [INFO] agent.router: Initializing LAN area manager
2022-09-17T20:00:18.788Z [INFO] agent.server.autopilot: reconciliation now disabled
2022-09-17T20:00:18.791Z [INFO] agent.server.raft: entering follower state: follower="Node at 10.0.1.15:8300 [Follower]" leader-address= leader-id=
2022-09-17T20:00:18.794Z [INFO] agent.server: Adding LAN server: server="82e5134c3a81 (Addr: tcp/10.0.1.15:8300) (DC: dc1)"
2022-09-17T20:00:18.806Z [INFO] agent: started state syncer
2022-09-17T20:00:18.806Z [INFO] agent: Consul agent running!
2022-09-17T20:00:18.807Z [INFO] agent: Retry join is supported for the following discovery methods: cluster=LAN discovery_methods="aliyun aws azure digitalocean gce k8s linode mdns os packet scaleway softlayer tencentcloud triton vsphere"
2022-09-17T20:00:18.807Z [INFO] agent: Joining cluster...: cluster=LAN
2022-09-17T20:00:18.807Z [INFO] agent: (LAN) joining: lan_addresses=[consul_server]
2022-09-17T20:00:18.812Z [INFO] agent.server.serf.lan: serf: EventMemberJoin: 728c5555b1d0 10.0.1.16
2022-09-17T20:00:18.813Z [INFO] agent: (LAN) joined: number_of_nodes=2
2022-09-17T20:00:18.813Z [INFO] agent: Join cluster completed. Synced with initial agents: cluster=LAN num_agents=2
2022-09-17T20:00:18.813Z [INFO] agent.server: Adding LAN server: server="728c5555b1d0 (Addr: tcp/10.0.1.16:8300) (DC: dc1)"
2022-09-17T20:00:25.643Z [WARN] agent.server.raft: no known peers, aborting election
2022-09-17T20:00:25.891Z [ERROR] agent.anti_entropy: failed to sync remote state: error="No cluster leader"
2022-09-17T20:00:30.303Z [INFO] agent.server.serf.lan: serf: EventMemberJoin: 8523d9956b78 10.0.1.4
2022-09-17T20:00:30.370Z [INFO] agent.server.serf.lan: serf: EventMemberJoin: 465d55d8d2e8 10.0.1.14
2022-09-17T20:00:30.370Z [INFO] agent.server: Adding LAN server: server="465d55d8d2e8 (Addr: tcp/10.0.1.14:8300) (DC: dc1)"
2022-09-17T20:00:30.380Z [INFO] agent.server: Found expected number of peers, attempting bootstrap: peers=10.0.1.15:8300,10.0.1.16:8300,10.0.1.14:8300
2022-09-17T20:00:31.692Z [INFO] agent.server: New leader elected: payload=728c5555b1d0
2022-09-17T20:00:40.012Z [WARN] agent: Syncing node info failed.: error="rpc error making call: i/o deadline reached"
2022-09-17T20:00:40.012Z [ERROR] agent.anti_entropy: failed to sync remote state: error="rpc error making call: i/o deadline reached"
2022-09-17T20:00:40.955Z [INFO] agent.server.serf.lan: serf: EventMemberJoin: 84b80437a8af 10.0.1.6
2022-09-17T20:00:43.822Z [INFO] agent.server.serf.lan: serf: EventMemberJoin: ddd0f09a4f97 10.0.1.5
2022-09-17T20:00:43.862Z [INFO] agent.server.serf.lan: serf: EventMemberJoin: ea8ef0287938 10.0.1.7
2022-09-17T20:00:44.340Z [INFO] agent.server.serf.lan: serf: EventMemberJoin: df3ebf259660 10.0.1.2
2022-09-17T20:00:54.484Z [INFO] agent.server.serf.lan: serf: EventMemberJoin: 577e36c1936c 10.0.1.3
2022-09-17T20:01:04.314Z [ERROR] agent.server.memberlist.lan: memberlist: failed to receive and remove the stream label header: read tcp 10.0.1.15:8301->10.0.1.2:53122: i/o timeout from=<unknown address>
2022-09-17T20:01:12.386Z [WARN] agent: Syncing node info failed.: error="rpc error making call: i/o deadline reached"
2022-09-17T20:01:12.386Z [ERROR] agent.anti_entropy: failed to sync remote state: error="rpc error making call: i/o deadline reached"
2022-09-17T20:01:17.787Z [ERROR] agent.server.memberlist.lan: memberlist: failed to receive and remove the stream label header: read tcp 10.0.1.15:8301->10.0.1.3:54666: i/o timeout from=<unknown address>
2022-09-17T20:01:23.721Z [ERROR] agent.server.memberlist.lan: memberlist: Push/Pull with 8523d9956b78 failed: read tcp 10.0.1.15:57786->10.0.1.4:8301: i/o timeout
2022-09-17T20:01:24.868Z [ERROR] agent.server.memberlist.lan: memberlist: failed to receive and remove the stream label header: read tcp 10.0.1.15:8301->10.0.1.16:58220: i/o timeout from=<unknown address>
2022-09-17T20:01:40.341Z [WARN] agent: Syncing node info failed.: error="rpc error making call: i/o deadline reached"
2022-09-17T20:01:40.341Z [ERROR] agent.anti_entropy: failed to sync remote state: error="rpc error making call: i/o deadline reached"
2022-09-17T20:02:03.725Z [ERROR] agent.server.memberlist.lan: memberlist: Push/Pull with 465d55d8d2e8 failed: read tcp 10.0.1.15:47200->10.0.1.14:8301: i/o timeout
2022-09-17T20:02:05.898Z [WARN] agent: Syncing node info failed.: error="rpc error making call: i/o deadline reached"
2022-09-17T20:02:05.898Z [ERROR] agent.anti_entropy: failed to sync remote state: error="rpc error making call: i/o deadline reached"
And finally, the log for one of the six agents (obtained by running sudo docker service logs consul_agent):
==> Using /run/secrets/consul_ca and /run/secrets/consul_ca_key
==> Saved dc1-client-consul-0.pem
==> Saved dc1-client-consul-0-key.pem
==> Starting Consul agent...
Version: '1.13.1'
Build Date: '2022-08-11 19:07:00 +0000 UTC'
Node ID: 'dda25a31-bec5-c12d-83e9-7101b8e0873d'
Node name: '84b80437a8af'
Datacenter: 'dc1' (Segment: '')
Server: false (Bootstrap: false)
Client Addr: [0.0.0.0] (HTTP: -1, HTTPS: -1, gRPC: -1, DNS: 8600)
Cluster Addr: 10.0.1.6 (LAN: 8301, WAN: -1)
Encrypt: Gossip: true, TLS-Outgoing: true, TLS-Incoming: true, Auto-Encrypt-TLS: false
consul_agent.0.tor7030dv851@cloudsrv-1 |
==> Log data will now stream in as it occurs:
consul_agent.0.tor7030dv851@cloudsrv-1 |
2022-09-17T20:00:10.920Z [INFO] agent.client.serf.lan: serf: EventMemberJoin: 84b80437a8af 10.0.1.6
2022-09-17T20:00:10.920Z [INFO] agent.router: Initializing LAN area manager
2022-09-17T20:00:10.924Z [INFO] agent: Started DNS server: address=0.0.0.0:8600 network=udp
2022-09-17T20:00:10.925Z [INFO] agent: Started DNS server: address=0.0.0.0:8600 network=tcp
2022-09-17T20:00:10.927Z [INFO] agent: started state syncer
2022-09-17T20:00:10.927Z [INFO] agent: Consul agent running!
2022-09-17T20:00:10.928Z [INFO] agent: Retry join is supported for the following discovery methods: cluster=LAN discovery_methods="aliyun aws azure digitalocean gce k8s linode mdns os packet scaleway softlayer tencentcloud triton vsphere"
2022-09-17T20:00:10.928Z [INFO] agent: Joining cluster...: cluster=LAN
2022-09-17T20:00:10.928Z [INFO] agent: (LAN) joining: lan_addresses=[consul_server]
2022-09-17T20:00:10.931Z [WARN] agent.router.manager: No servers available
2022-09-17T20:00:10.931Z [ERROR] agent.anti_entropy: failed to sync remote state: error="No known Consul servers"
2022-09-17T20:00:10.950Z [WARN] agent.client.memberlist.lan: memberlist: Failed to resolve consul_server: lookup consul_server on 127.0.0.11:53: no such host
2022-09-17T20:00:10.950Z [WARN] agent: (LAN) couldn't join: number_of_nodes=0 error="1 error occurred:
* Failed to resolve consul_server: lookup consul_server on 127.0.0.11:53: no such host
consul_agent.0.tor7030dv851@cloudsrv-1 |
"
2022-09-17T20:00:10.950Z [WARN] agent: Join cluster failed, will retry: cluster=LAN retry_interval=30s error=<nil>
2022-09-17T20:00:33.170Z [WARN] agent.router.manager: No servers available
2022-09-17T20:00:33.170Z [ERROR] agent.anti_entropy: failed to sync remote state: error="No known Consul servers"
2022-09-17T20:00:40.951Z [INFO] agent: (LAN) joining: lan_addresses=[consul_server]
2022-09-17T20:00:40.956Z [INFO] agent.client.serf.lan: serf: EventMemberJoin: 465d55d8d2e8 10.0.1.14
2022-09-17T20:00:40.956Z [INFO] agent.client.serf.lan: serf: EventMemberJoin: 8523d9956b78 10.0.1.4
2022-09-17T20:00:40.956Z [INFO] agent.client.serf.lan: serf: EventMemberJoin: 82e5134c3a81 10.0.1.15
2022-09-17T20:00:40.956Z [INFO] agent.client.serf.lan: serf: EventMemberJoin: 728c5555b1d0 10.0.1.16
2022-09-17T20:00:40.957Z [INFO] agent.client: adding server: server="465d55d8d2e8 (Addr: tcp/10.0.1.14:8300) (DC: dc1)"
2022-09-17T20:00:40.957Z [INFO] agent.client: adding server: server="82e5134c3a81 (Addr: tcp/10.0.1.15:8300) (DC: dc1)"
2022-09-17T20:00:40.957Z [INFO] agent.client: adding server: server="728c5555b1d0 (Addr: tcp/10.0.1.16:8300) (DC: dc1)"
2022-09-17T20:00:40.966Z [INFO] agent: (LAN) joined: number_of_nodes=3
2022-09-17T20:00:40.966Z [INFO] agent: Join cluster completed. Synced with initial agents: cluster=LAN num_agents=3
2022-09-17T20:00:43.923Z [INFO] agent.client.serf.lan: serf: EventMemberJoin: ddd0f09a4f97 10.0.1.5
2022-09-17T20:00:43.923Z [INFO] agent.client.serf.lan: serf: EventMemberJoin: ea8ef0287938 10.0.1.7
2022-09-17T20:00:44.388Z [INFO] agent.client.serf.lan: serf: EventMemberJoin: df3ebf259660 10.0.1.2
2022-09-17T20:00:44.690Z [INFO] agent.client.serf.lan: serf: EventMemberJoin: 577e36c1936c 10.0.1.3
2022-09-17T20:00:50.621Z [ERROR] agent.client: RPC failed to server: method=Catalog.Register server=10.0.1.14:8300 error="rpc error making call: i/o deadline reached"
2022-09-17T20:00:50.621Z [WARN] agent: Syncing node info failed.: error="rpc error making call: i/o deadline reached"
2022-09-17T20:00:50.622Z [ERROR] agent.anti_entropy: failed to sync remote state: error="rpc error making call: i/o deadline reached"
2022-09-17T20:00:59.283Z [ERROR] agent.client: RPC failed to server: method=Catalog.Register server=10.0.1.15:8300 error="rpc error making call: i/o deadline reached"
2022-09-17T20:00:59.283Z [WARN] agent: Syncing node info failed.: error="rpc error making call: i/o deadline reached"
2022-09-17T20:00:59.283Z [ERROR] agent.anti_entropy: failed to sync remote state: error="rpc error making call: i/o deadline reached"
2022-09-17T20:01:11.606Z [ERROR] agent.client.memberlist.lan: memberlist: Push/Pull with ddd0f09a4f97 failed: read tcp 10.0.1.6:57822->10.0.1.5:8301: i/o timeout
2022-09-17T20:01:13.166Z [ERROR] agent.client.memberlist.lan: memberlist: failed to receive and remove the stream label header: read tcp 10.0.1.6:8301->10.0.1.2:55424: i/o timeout from=<unknown address>
2022-09-17T20:01:13.445Z [ERROR] agent.client.memberlist.lan: memberlist: failed to receive and remove the stream label header: read tcp 10.0.1.6:8301->10.0.1.5:44266: i/o timeout from=<unknown address>
2022-09-17T20:01:16.436Z [ERROR] agent.client.memberlist.lan: memberlist: failed to receive and remove the stream label header: read tcp 10.0.1.6:8301->10.0.1.7:59768: i/o timeout from=<unknown address>
2022-09-17T20:01:23.510Z [ERROR] agent.client: RPC failed to server: method=Catalog.Register server=10.0.1.16:8300 error="rpc error making call: i/o deadline reached"
2022-09-17T20:01:23.510Z [WARN] agent: Syncing node info failed.: error="rpc error making call: i/o deadline reached"
2022-09-17T20:01:23.510Z [ERROR] agent.anti_entropy: failed to sync remote state: error="rpc error making call: i/o deadline reached"
2022-09-17T20:01:50.308Z [ERROR] agent.client: RPC failed to server: method=Catalog.Register server=10.0.1.14:8300 error="rpc error making call: i/o deadline reached"
2022-09-17T20:01:50.308Z [WARN] agent: Syncing node info failed.: error="rpc error making call: i/o deadline reached"
2022-09-17T20:01:50.308Z [ERROR] agent.anti_entropy: failed to sync remote state: error="rpc error making call: i/o deadline reached"
2022-09-17T20:01:51.609Z [ERROR] agent.client.memberlist.lan: memberlist: Push/Pull with 465d55d8d2e8 failed: read tcp 10.0.1.6:37694->10.0.1.14:8301: i/o timeout
2022-09-17T20:02:21.584Z [ERROR] agent.client: RPC failed to server: method=Catalog.Register server=10.0.1.14:8300 error="rpc error making call: i/o deadline reached"
2022-09-17T20:02:21.584Z [WARN] agent: Syncing node info failed.: error="rpc error making call: i/o deadline reached"
2022-09-17T20:02:21.584Z [ERROR] agent.anti_entropy: failed to sync remote state: error="rpc error making call: i/o deadline reached"