Hello,
I’m trying to deploy several architectures with open-source Consul. So far I have set up the following, each as a separate deployment:
- a single-datacenter setup with ACLs enabled
- a multi-datacenter setup with ACLs disabled
- a multi-datacenter setup with ACLs enabled
In the third deployment, the error occurs when I try to join the secondary datacenter to the primary datacenter. This is the command I run from the node in the secondary DC:
consul join -token="XXXXX" -wan <public_ip_of_node_in_primary_dc>
The token in the command above is the initial_management token I generated when bootstrapping ACLs in the primary datacenter.
The error I’m getting is the following:
Error joining address '<public_ip>': Unexpected response code: 403 (Permission denied: token with AccessorID 'primary-dc-down' lacks permission 'agent:write' on "test-dlo")
Failed to join any nodes.
Strangely enough, when tailing the logs on the primary DC node, I cannot see any incoming request, so I believe the 403 error comes from the secondary datacenter itself.
The config files I’m using are the following:
Primary datacenter:
{
  "datacenter": "minus3-europe",
  "data_dir": "/consuldata",
  "node_name": "ConsulServer-10-0-2-142",
  "server": true,
  "bootstrap_expect": 3,
  "advertise_addr": "10.0.2.142",
  "advertise_addr_wan": "<public_ip>",
  "leave_on_terminate": true,
  "reconnect_timeout": "8h",
  "reconnect_timeout_wan": "8h",
  "retry_join": ["provider=aws tag_key=ConsulAutoJoinSecret tag_value=17d651ad-dfb2-abeb-30c3-81621dd65a17"],
  "log_file": "/consuldata/",
  "log_level": "DEBUG",
  "log_rotate_duration": "24h",
  "log_rotate_max_files": 7,
  "ui_config": {
    "enabled": true
  },
  "bind_addr": "0.0.0.0",
  "addresses": {
    "http": "0.0.0.0"
  },
  "acl": {
    "enabled": true,
    "default_policy": "deny",
    "enable_token_persistence": true,
    "enable_token_replication": true,
    "tokens": {
      "initial_management": "XXXXX"
    }
  },
  "primary_datacenter": "minus3-europe"
}
Secondary DC:
{
  "datacenter": "minus3-us",
  "primary_datacenter": "minus3-europe",
  "data_dir": "/consuldata",
  "node_name": "test-dlo",
  "server": true,
  "bootstrap_expect": 1,
  "advertise_addr": "172.31.7.62",
  "advertise_addr_wan": "<public_ip>",
  "leave_on_terminate": true,
  "reconnect_timeout": "8h",
  "reconnect_timeout_wan": "8h",
  "retry_join": ["provider=aws tag_key=Name tag_value=TEstDLO"],
  "log_file": "/consuldata/",
  "log_level": "DEBUG",
  "log_rotate_duration": "24h",
  "log_rotate_max_files": 7,
  "ui_config": {
    "enabled": true
  },
  "bind_addr": "0.0.0.0",
  "addresses": {
    "http": "0.0.0.0"
  },
  "acl": {
    "enabled": true,
    "default_policy": "deny",
    "down_policy": "deny",
    "enable_token_persistence": true,
    "enable_token_replication": true
  }
}
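For comparison, the WAN join attempted with the CLI could also be declared in the secondary server's configuration, so that the agent attempts the join itself at startup and no token is passed on the command line. A minimal sketch, assuming Consul's `retry_join_wan` option and reusing the placeholder address from the join command; it has not been tried in this deployment:

```json
{
  "datacenter": "minus3-us",
  "primary_datacenter": "minus3-europe",
  "retry_join_wan": ["<public_ip_of_node_in_primary_dc>"]
}
```

Only the `retry_join_wan` line is new here. The other two keys are copied from the secondary DC config for context, and the remaining settings would stay as they are.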
Besides the error I get when trying to join the WAN, the log file of the node in the secondary DC is polluted with the following:
2022-09-06T11:01:08.461Z [WARN] agent.server.rpc: RPC request for DC is currently failing as no path was found: datacenter=minus3-europe method=ACL.TokenRead
2022-09-06T11:01:08.462Z [ERROR] agent.acl: Error resolving token: error="Error communicating with the ACL Datacenter: No path to datacenter"
2022-09-06T11:01:08.462Z [WARN] agent.server.rpc: RPC request for DC is currently failing as no path was found: datacenter=minus3-europe method=ACL.TokenRead
2022-09-06T11:01:08.462Z [ERROR] agent.acl: Error resolving token: error="Error communicating with the ACL Datacenter: No path to datacenter"
2022-09-06T11:01:08.462Z [WARN] agent: Coordinate update blocked by ACLs: accessorID=primary-dc-down
2022-09-06T11:01:10.815Z [DEBUG] agent.server: federation states are not enabled in the primary dc
2022-09-06T11:01:15.815Z [DEBUG] agent.server: federation states are not enabled in the primary dc
2022-09-06T11:01:20.815Z [DEBUG] agent.server: federation states are not enabled in the primary dc
2022-09-06T11:01:25.815Z [DEBUG] agent.server: federation states are not enabled in the primary dc
2022-09-06T11:01:27.613Z [WARN] agent.server.rpc: RPC request for DC is currently failing as no path was found: datacenter=minus3-europe method=ACL.TokenRead
2022-09-06T11:01:27.614Z [ERROR] agent.acl: Error resolving token: error="Error communicating with the ACL Datacenter: No path to datacenter"
2022-09-06T11:01:27.614Z [WARN] agent.server.rpc: RPC request for DC is currently failing as no path was found: datacenter=minus3-europe method=ACL.TokenRead
2022-09-06T11:01:27.614Z [ERROR] agent.acl: Error resolving token: error="Error communicating with the ACL Datacenter: No path to datacenter"
2022-09-06T11:01:27.614Z [WARN] agent: Coordinate update blocked by ACLs: accessorID=primary-dc-down
I understand that after joining the WAN I still need to configure the replication token in the secondary datacenter, but I think that is the next step, right? First and foremost, I need the secondary datacenter to join the WAN properly, and I cannot make progress from here.
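For context, the replication step mentioned above would add a token to the `acl` block of the secondary DC config. A rough sketch, assuming Consul's `acl.tokens.replication` setting; the token value is a hypothetical placeholder for a token that would first have to be created in the primary datacenter:

```json
{
  "acl": {
    "enabled": true,
    "default_policy": "deny",
    "down_policy": "deny",
    "enable_token_persistence": true,
    "enable_token_replication": true,
    "tokens": {
      "replication": "<replication_token_from_primary_dc>"
    }
  }
}
```

This is not applied in the deployment described here, since the secondary datacenter has to reach the primary over the WAN before replication can start.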
Any idea what I am missing?
Thanks,
David