Hi @lkysow, I got past this step.
For some reason, deploying the entire configuration with --wait throws a timeout error.
So I first deployed the server with federation disabled and, once the server pod was up and running, upgraded the release with federation and gossip encryption enabled.
Now that I have created two datacenters, dc1 and dc2, dc2 is not able to join the cluster.
Below is the output from the Consul server in dc2.
2021-06-17T06:26:31.923Z [WARN] agent: BootstrapExpect is set to 1; this is the same as Bootstrap mode.
2021-06-17T06:26:31.923Z [WARN] agent: bootstrap = true: do not enable unless necessary
2021-06-17T06:26:32.127Z [WARN] agent.auto_config: BootstrapExpect is set to 1; this is the same as Bootstrap mode.
2021-06-17T06:26:32.127Z [WARN] agent.auto_config: bootstrap = true: do not enable unless necessary
2021-06-17T06:26:32.224Z [INFO] agent.server.gateway_locator: will dial the primary datacenter using our local mesh gateways if possible
2021-06-17T06:26:32.256Z [INFO] agent.server.raft: initial configuration: index=12 servers="[{Suffrage:Voter ID:9a1199a6-4f73-797e-7f3d-11aa6e6c2d58 Address:10.8.6.35:8300}]"
2021-06-17T06:26:32.256Z [INFO] agent.server.raft: entering follower state: follower=“Node at 10.8.6.43:8300 [Follower]” leader=
2021-06-17T06:26:32.323Z [INFO] agent.server.serf.wan: serf: EventMemberJoin: consul-server-0.dc2 10.8.6.43
2021-06-17T06:26:32.323Z [INFO] agent.server.serf.wan: serf: Attempting re-join to previously known node: consul-server-0.dc1: 10.8.6.33:8302
2021-06-17T06:26:32.323Z [WARN] agent.server.serf.wan: serf: Failed to re-join any previously known node
2021-06-17T06:26:32.324Z [INFO] agent.server.serf.lan: serf: EventMemberJoin: consul-server-0 10.8.6.43
2021-06-17T06:26:32.324Z [INFO] agent.router: Initializing LAN area manager
2021-06-17T06:26:32.324Z [INFO] agent.server.serf.lan: serf: Attempting re-join to previously known node: gke-test-cluster-pool-3-7fc2780c-vm5f: 10.8.6.37:8301
2021-06-17T06:26:32.324Z [INFO] agent: Started DNS server: address=0.0.0.0:8600 network=udp
2021-06-17T06:26:32.324Z [INFO] agent.server: Adding LAN server: server=“consul-server-0 (Addr: tcp/10.8.6.43:8300) (DC: dc2)”
2021-06-17T06:26:32.324Z [INFO] agent.server: Handled event for server in area: event=member-join server=consul-server-0.dc2 area=wan
2021-06-17T06:26:32.325Z [INFO] agent: Started DNS server: address=0.0.0.0:8600 network=tcp
2021-06-17T06:26:32.325Z [INFO] agent.server.serf.lan: serf: Attempting re-join to previously known node: gke-test-cluster-pool-3-7fc2780c-grt7: 10.8.4.18:8301
2021-06-17T06:26:32.328Z [INFO] agent: Starting server: address=[::]:8501 network=tcp protocol=https
2021-06-17T06:26:32.328Z [WARN] agent: DEPRECATED Backwards compatibility with pre-1.9 metrics enabled. These metrics will be removed in a future version of Consul. Set telemetry { disable_compat_1.9 = true } to disable them.
2021-06-17T06:26:32.328Z [WARN] agent.server.serf.lan: serf: Failed to re-join any previously known node
2021-06-17T06:26:32.328Z [INFO] agent: started state syncer
==> Consul agent running!
2021-06-17T06:26:32.328Z [INFO] agent: Refreshing mesh gateways is supported for the following discovery methods: discovery_methods=“aliyun aws azure digitalocean gce k8s linode mdns os packet scaleway softlayer tencentcloud triton vsphere”
2021-06-17T06:26:32.328Z [INFO] agent: Refreshing mesh gateways…
2021-06-17T06:26:32.328Z [INFO] agent.server.gateway_locator: updated fallback list of primary mesh gateways: mesh_gateways=[34.93.181.62:443]
2021-06-17T06:26:32.328Z [INFO] agent: Refreshing mesh gateways completed
2021-06-17T06:26:32.329Z [INFO] agent: Retry join is supported for the following discovery methods: cluster=WAN discovery_methods=“aliyun aws azure digitalocean gce k8s linode mdns os packet scaleway softlayer tencentcloud triton vsphere”
2021-06-17T06:26:32.329Z [INFO] agent: Joining cluster…: cluster=WAN
2021-06-17T06:26:32.329Z [INFO] agent: (WAN) joining: wan_addresses=[*.dc1/192.0.2.2]
2021-06-17T06:26:32.329Z [WARN] agent: (WAN) couldn’t join: number_of_nodes=0 error="1 error occurred:
* Failed to join 192.0.2.2: Remote DC has no server currently reachable
"
2021-06-17T06:26:32.329Z [WARN] agent: Join cluster failed, will retry: cluster=WAN retry_interval=30s error=
2021-06-17T06:26:32.328Z [INFO] agent: Retry join is supported for the following discovery methods: cluster=LAN discovery_methods=“aliyun aws azure digitalocean gce k8s linode mdns os packet scaleway softlayer tencentcloud triton vsphere”
2021-06-17T06:26:32.329Z [INFO] agent: Joining cluster…: cluster=LAN
2021-06-17T06:26:32.329Z [INFO] agent: (LAN) joining: lan_addresses=[consul-server-0.consul-server.default.svc:8301]
2021-06-17T06:26:32.422Z [INFO] agent: (LAN) joined: number_of_nodes=1
2021-06-17T06:26:32.422Z [INFO] agent: Join cluster completed. Synced with initial agents: cluster=LAN num_agents=1
2021-06-17T06:26:37.953Z [WARN] agent.server.raft: heartbeat timeout reached, starting election: last-leader=
2021-06-17T06:26:37.953Z [INFO] agent.server.raft: entering candidate state: node=“Node at 10.8.6.43:8300 [Candidate]” term=3
2021-06-17T06:26:37.958Z [INFO] agent.server.raft: election won: tally=1
2021-06-17T06:26:37.958Z [INFO] agent.server.raft: entering leader state: leader=“Node at 10.8.6.43:8300 [Leader]”
2021-06-17T06:26:37.958Z [INFO] agent.server: cluster leadership acquired
2021-06-17T06:26:37.958Z [INFO] agent.server: New leader elected: payload=consul-server-0
2021-06-17T06:26:38.227Z [INFO] agent: Synced node info
2021-06-17T06:26:38.227Z [ERROR] agent.server.autopilot: Error when computing next state: error=“cannot detect the current leader server id from its address: 10.8.6.43:8300”
2021-06-17T06:26:38.227Z [INFO] agent.leader: started routine: routine=“config entry replication”
2021-06-17T06:26:38.227Z [INFO] agent.leader: started routine: routine=“federation state replication”
2021-06-17T06:26:38.227Z [INFO] agent.leader: started routine: routine=“federation state anti-entropy”
2021-06-17T06:26:38.227Z [WARN] agent.server.connect: primary datacenter is configured but unreachable - deferring initialization of the secondary datacenter CA
2021-06-17T06:26:38.227Z [INFO] agent.leader: started routine: routine=“secondary CA roots watch”
2021-06-17T06:26:38.227Z [INFO] agent.leader: started routine: routine=“intermediate cert renew watch”
2021-06-17T06:26:38.227Z [INFO] agent.leader: started routine: routine=“CA root pruning”
2021-06-17T06:26:38.227Z [INFO] agent.server.raft: updating configuration: command=AddStaging server-id=9a1199a6-4f73-797e-7f3d-11aa6e6c2d58 server-addr=10.8.6.43:8300 servers="[{Suffrage:Voter ID:9a1199a6-4f73-797e-7f3d-11aa6e6c2d58 Address:10.8.6.43:8300}]"
2021-06-17T06:26:38.228Z [WARN] agent.server.rpc: RPC request for DC is currently failing as no path was found: datacenter=dc1 method=ConfigEntry.ListAll
2021-06-17T06:26:38.228Z [WARN] agent.server.replication.config_entry: replication error (will retry if still leader): error=“failed to retrieve remote config entries: No path to datacenter”
2021-06-17T06:26:38.228Z [WARN] agent.server.rpc: RPC request for DC is currently failing as no path was found: datacenter=dc1 method=FederationState.List
2021-06-17T06:26:38.228Z [WARN] agent.server.replication.federation_state: replication error (will retry if still leader): error=“failed to retrieve federation states: No path to datacenter”
2021-06-17T06:26:38.228Z [WARN] agent.server.rpc: RPC request for DC is currently failing as no path was found: datacenter=dc1 method=ConnectCA.Roots
2021-06-17T06:26:38.228Z [ERROR] agent.server.connect: CA root replication failed, will retry: routine=“secondary CA roots watch” error=“Error retrieving the primary datacenter’s roots: No path to datacenter”
2021-06-17T06:26:38.229Z [INFO] agent.server: deregistering member: member=gke-test-cluster-pool-3-7fc2780c-grt7 reason=reaped
2021-06-17T06:26:38.231Z [INFO] agent.server: deregistering member: member=gke-test-cluster-pool-3-7fc2780c-vm5f reason=reaped
2021-06-17T06:26:38.231Z [WARN] agent.server.rpc: RPC request for DC is currently failing as no path was found: datacenter=dc1 method=FederationState.Apply
2021-06-17T06:26:38.232Z [ERROR] agent.server: error performing anti-entropy sync of federation state: error=“error performing federation state anti-entropy sync: No path to datacenter”
2021-06-17T06:26:39.229Z [WARN] agent.server.rpc: RPC request for DC is currently failing as no path was found: datacenter=dc1 method=ConfigEntry.ListAll
2021-06-17T06:26:39.229Z [WARN] agent.server.replication.config_entry: replication error (will retry if still leader): error=“failed to retrieve remote config entries: No path to datacenter”
2021-06-17T06:26:39.301Z [WARN] agent.server.rpc: RPC request for DC is currently failing as no path was found: datacenter=dc1 method=FederationState.List
2021-06-17T06:26:39.301Z [WARN] agent.server.replication.federation_state: replication error (will retry if still leader): error=“failed to retrieve federation states: No path to datacenter”
2021-06-17T06:26:40.228Z [WARN] agent.server.rpc: RPC request for DC is currently failing as no path was found: datacenter=dc1 method=ConnectCA.Roots
2021-06-17T06:26:40.228Z [ERROR] agent.server.connect: CA root replication failed, will retry: routine=“secondary CA roots watch” error=“Error retrieving the primary datacenter’s roots: No path to datacenter”
2021-06-17T06:26:40.232Z [WARN] agent.server.rpc: RPC request for DC is currently failing as no path was found: datacenter=dc1 method=FederationState.Apply
2021-06-17T06:26:40.232Z [ERROR] agent.server: error performing anti-entropy sync of federation state: error=“error performing federation state anti-entropy sync: No path to datacenter”
2021-06-17T06:26:41.259Z [WARN] agent.server.rpc: RPC request for DC is currently failing as no path was found: datacenter=dc1 method=ConfigEntry.ListAll
2021-06-17T06:26:41.259Z [WARN] agent.server.replication.config_entry: replication error (will retry if still leader): error=“failed to retrieve remote config entries: No path to datacenter”
2021-06-17T06:26:41.453Z [WARN] agent.server.rpc: RPC request for DC is currently failing as no path was found: datacenter=dc1 method=FederationState.List
2021-06-17T06:26:41.453Z [INFO] agent.server.gateway_locator: will dial the primary datacenter through its mesh gateways
2021-06-17T06:26:41.453Z [WARN] agent.server.replication.federation_state: replication error (will retry if still leader): error=“failed to retrieve federation states: No path to datacenter”
2021-06-17T06:26:42.401Z [INFO] agent: Newer Consul version available: new_version=1.9.6 current_version=1.9.4
2021-06-17T06:26:42.795Z [INFO] agent.server.serf.lan: serf: EventMemberJoin: gke-test-cluster-pool-3-7fc2780c-grt7 10.8.4.20
2021-06-17T06:26:42.795Z [INFO] agent.server: member joined, marking health alive: member=gke-test-cluster-pool-3-7fc2780c-grt7
2021-06-17T06:26:44.228Z [WARN] agent.server.rpc: RPC request for DC is currently failing as no path was found: datacenter=dc1 method=ConnectCA.Roots
2021-06-17T06:26:44.228Z [ERROR] agent.server.connect: CA root replication failed, will retry: routine=“secondary CA roots watch” error=“Error retrieving the primary datacenter’s roots: No path to datacenter”
2021-06-17T06:26:44.232Z [WARN] agent.server.rpc: RPC request for DC is currently failing as no path was found: datacenter=dc1 method=FederationState.Apply
2021-06-17T06:26:44.232Z [ERROR] agent.server: error performing anti-entropy sync of federation state: error=“error performing federation state anti-entropy sync: No path to datacenter”
2021-06-17T06:26:45.350Z [WARN] agent.server.rpc: RPC request for DC is currently failing as no path was found: datacenter=dc1 method=ConfigEntry.ListAll
2021-06-17T06:26:45.350Z [WARN] agent.server.replication.config_entry: replication error (will retry if still leader): error=“failed to retrieve remote config entries: No path to datacenter”
2021-06-17T06:26:45.731Z [WARN] agent.server.rpc: RPC request for DC is currently failing as no path was found: datacenter=dc1 method=FederationState.List
2021-06-17T06:26:45.731Z [WARN] agent.server.replication.federation_state: replication error (will retry if still leader): error=“failed to retrieve federation states: No path to datacenter”
2021-06-17T06:26:52.228Z [WARN] agent.server.rpc: RPC request for DC is currently failing as no path was found: datacenter=dc1 method=ConnectCA.Roots
2021-06-17T06:26:52.228Z [ERROR] agent.server.connect: CA root replication failed, will retry: routine=“secondary CA roots watch” error=“Error retrieving the primary datacenter’s roots: No path to datacenter”
2021-06-17T06:26:52.232Z [WARN] agent.server.rpc: RPC request for DC is currently failing as no path was found: datacenter=dc1 method=FederationState.Apply
2021-06-17T06:26:52.232Z [ERROR] agent.server: error performing anti-entropy sync of federation state: error=“error performing federation state anti-entropy sync: No path to datacenter”
2021-06-17T06:26:53.978Z [WARN] agent.server.rpc: RPC request for DC is currently failing as no path was found: datacenter=dc1 method=FederationState.List
2021-06-17T06:26:53.978Z [WARN] agent.server.replication.federation_state: replication error (will retry if still leader): error=“failed to retrieve federation states: No path to datacenter”
2021-06-17T06:26:54.006Z [WARN] agent.server.rpc: RPC request for DC is currently failing as no path was found: datacenter=dc1 method=ConfigEntry.ListAll
2021-06-17T06:26:54.006Z [WARN] agent.server.replication.config_entry: replication error (will retry if still leader): error=“failed to retrieve remote config entries: No path to datacenter”
2021-06-17T06:26:55.271Z [INFO] agent.server.serf.lan: serf: EventMemberJoin: gke-test-cluster-pool-3-7fc2780c-vm5f 10.8.6.44
2021-06-17T06:26:55.271Z [INFO] agent.server: member joined, marking health alive: member=gke-test-cluster-pool-3-7fc2780c-vm5f
2021-06-17T06:26:55.764Z [INFO] agent.server.serf.lan: serf: EventMemberJoin: gke-test-cluster-pool-3-7fc2780c-699n 10.8.5.19
2021-06-17T06:26:55.764Z [INFO] agent.server: member joined, marking health alive: member=gke-test-cluster-pool-3-7fc2780c-699n
2021-06-17T06:27:02.329Z [INFO] agent: (WAN) joining: wan_addresses=[*.dc1/192.0.2.2]
2021-06-17T06:27:02.922Z [WARN] agent: (WAN) couldn’t join: number_of_nodes=0 error="1 error occurred:
* Failed to join 192.0.2.2: x509: certificate signed by unknown authority (possibly because of “x509: ECDSA verification failure” while trying to verify candidate authority certificate “Consul Agent CA”)
"
2021-06-17T06:27:02.922Z [WARN] agent: Join cluster failed, will retry: cluster=WAN retry_interval=30s error=
2021-06-17T06:27:08.229Z [WARN] agent.server.rpc: RPC request for DC is currently failing as no path was found: datacenter=dc1 method=ConnectCA.Roots
2021-06-17T06:27:08.229Z [ERROR] agent.server.connect: CA root replication failed, will retry: routine=“secondary CA roots watch” error=“Error retrieving the primary datacenter’s roots: No path to datacenter”
2021-06-17T06:27:08.233Z [WARN] agent.server.rpc: RPC request for DC is currently failing as no path was found: datacenter=dc1 method=FederationState.Apply
2021-06-17T06:27:08.233Z [ERROR] agent.server: error performing anti-entropy sync of federation state: error=“error performing federation state anti-entropy sync: No path to datacenter”
2021-06-17T06:27:10.503Z [WARN] agent.server.rpc: RPC request for DC is currently failing as no path was found: datacenter=dc1 method=FederationState.List
2021-06-17T06:27:10.503Z [WARN] agent.server.replication.federation_state: replication error (will retry if still leader): error=“failed to retrieve federation states: No path to datacenter”
2021-06-17T06:27:10.607Z [WARN] agent.server.rpc: RPC request for DC is currently failing as no path was found: datacenter=dc1 method=ConfigEntry.ListAll
2021-06-17T06:27:10.607Z [WARN] agent.server.replication.config_entry: replication error (will retry if still leader): error=“failed to retrieve remote config entries: No path to datacenter”
2021-06-17T06:27:32.922Z [INFO] agent: (WAN) joining: wan_addresses=[*.dc1/192.0.2.2]
2021-06-17T06:27:33.425Z [WARN] agent: (WAN) couldn’t join: number_of_nodes=0 error="1 error occurred:
* Failed to join 192.0.2.2: x509: certificate signed by unknown authority (possibly because of “x509: ECDSA verification failure” while trying to verify candidate authority certificate “Consul Agent CA”)
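The output above is noisy, so a small filter helps isolate the distinct WAN join failures. This is a minimal sketch, assuming the log is piped in from something like `kubectl logs consul-server-0`; the function name `wan_join_failures` is only illustrative.

```shell
# Print each distinct WAN join failure reason once, prefixed by how often it occurs.
# Reads a Consul server log on stdin.
wan_join_failures() {
  # Keep only the "Failed to join <addr>: <reason>" part of each matching line,
  # then count identical reasons and list the most frequent first.
  grep -o 'Failed to join [^:]*: .*' | sort | uniq -c | sort -rn
}
```

For the log in this post it would report two reasons: the initial "Remote DC has no server currently reachable" and the repeated x509 "certificate signed by unknown authority" error.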
Here is my values.yaml for dc2.
global:
  name: consul
  datacenter: dc2
  federation:
    enabled: true
    createFederationSecret: true
  tls:
    enabled: true
    caCert:
      secretName: consul-federation
      secretKey: caCert
    caKey:
      secretName: consul-federation
      secretKey: caKey
  gossipEncryption:
    secretName: consul-federation
    secretKey: gossipEncryptionKey
server:
  replicas: 1
  extraVolumes:
    - type: secret
      name: consul-federation
      items:
        - key: serverConfigJSON
          path: config.json
      load: true
ui:
  service:
    type: 'LoadBalancer'
    enabled: true
meshGateway:
  enabled: true
  replicas: 1
connectInject:
  enabled: true
controller:
  enabled: true
When I check the ProxyDefaults status on dc2, the SYNCED column is empty.
kubectl get proxydefaults global
NAME     SYNCED   LAST SYNCED   AGE
global                          56s