I am getting the following error message when a client Consul agent attempts to join a secondary datacenter, and I don't understand why:
agent: AutoEncrypt failed: rpcinsecure error making call: rpcinsecure error making call: internal error: CA root is nil
The equivalent Consul configuration works in the primary datacenter.
Is this a problem with my configuration, or a limitation/bug of Consul?
Configuration overview:
- Consul community edition version 1.6.1
- Two Consul datacenters, one Consul domain
- Cloud auto-join (AWS) used for discovery
- Secondary datacenter Consul servers join the cluster successfully with healthy status
- Secondary datacenter Consul servers have retry_join configured for the secondary datacenter and retry_join_wan configured for the primary datacenter
- Failing client agent has retry_join configured for the secondary datacenter (and no retry_join_wan)
- Failing client agent is in the same subnet as the secondary datacenter Consul servers, with the same security groups applied
- Gossip encryption enabled
- TLS enabled/enforced
- CA cert installed and configured on all hosts
- ACL enabled/enforced
- ACL persistence/replication enabled
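For readers who would rather not open the attachments, the client setup described above corresponds roughly to the sketch below. It is illustrative only and not a copy of the attached files: the datacenter name, AWS tag key/value, file paths, and gossip key are all placeholders, and the exact TLS and ACL options in the real files may differ.

```hcl
# Rough sketch of a client agent in the secondary datacenter.
# Every value here is a placeholder, not taken from the attached files.
datacenter = "dc2"            # hypothetical name of the secondary datacenter
data_dir   = "/opt/consul"    # placeholder path

# Cloud auto-join (AWS): LAN join only, no retry_join_wan on clients.
retry_join = ["provider=aws tag_key=consul-dc tag_value=dc2"]

# Gossip encryption (placeholder key).
encrypt = "<base64 gossip key>"

# TLS enabled/enforced, with the CA cert installed locally.
ca_file                = "/etc/consul.d/ca.pem"
verify_outgoing        = true
verify_server_hostname = true

# The client asks the servers for its TLS certificate; this is the
# step that produces the AutoEncrypt error shown above.
auto_encrypt {
  tls = true
}

# ACLs enabled/enforced, with token persistence.
acl {
  enabled                  = true
  default_policy           = "deny"
  enable_token_persistence = true
}
```

In a setup like this, the servers would presumably differ mainly in having `server = true`, a `retry_join_wan` entry pointing at the primary datacenter (secondary servers only), `enable_token_replication = true` in the `acl` block, and `auto_encrypt { allow_tls = true }` in place of `tls = true`.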
Here are the exact consul.hcl configuration files I am using on the various nodes:
- Failing client agent in secondary datacenter: backend.shi2.hcl.txt (1.7 KB)
- Successful client agent in primary datacenter: frontend.shi1.hcl.txt (1.7 KB)
- Server in primary datacenter: server.shi1.hcl.txt (1.9 KB)
- Server in secondary datacenter: server.shi2.hcl.txt (1.9 KB)