Non-server agent unable to join secondary datacenter with "CA root is nil" error

I am getting the following error when a client Consul agent attempts to join a secondary datacenter, and I don't understand why:

agent: AutoEncrypt failed: rpcinsecure error making call: rpcinsecure error making call: internal error: CA root is nil

The equivalent Consul configuration works in the primary datacenter.

Is this a problem with my configuration, or a limitation/bug of Consul?

Configuration overview:

  • Consul community edition version 1.6.1
  • Two Consul datacenters, one Consul domain
  • Cloud auto-join (AWS)
  • Secondary datacenter Consul servers join the cluster successfully with healthy status
  • Secondary datacenter Consul servers have retry_join configured for secondary datacenter and retry_join_wan configured for primary datacenter
  • Failing client agent has retry_join configured for secondary datacenter (and no retry_join_wan)
  • Failing client agent is in the same subnet as the secondary datacenter Consul servers, with the same security groups applied
  • Gossip encryption enabled
  • TLS enabled/enforced
  • CA cert installed and configured on all hosts
  • ACL enabled/enforced
  • ACL persistence/replication enabled
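
To make the overview concrete, here is a minimal sketch of what a client agent configuration with these settings could look like. It is illustrative only: the datacenter names (dc1, dc2), file paths, AWS tag, gossip key, and token are placeholder assumptions, not values from my environment.

```hcl
# Hypothetical client agent config for the secondary datacenter.
# All names, paths, keys, and tokens below are placeholders.
datacenter         = "dc2"          # assumed name of the secondary datacenter
primary_datacenter = "dc1"          # assumed name of the primary datacenter
data_dir           = "/opt/consul"  # placeholder path
server             = false

# Cloud auto-join (AWS) against the secondary datacenter servers only;
# a client agent has no retry_join_wan.
retry_join = ["provider=aws tag_key=consul-dc tag_value=dc2"]

# Gossip encryption (placeholder key).
encrypt = "<gossip-encryption-key>"

# TLS: trust the installed CA and verify the servers.
ca_file                = "/etc/consul.d/ca.pem"
verify_incoming        = false
verify_outgoing        = true
verify_server_hostname = true

# The client requests its TLS certificate from the servers.
# The servers must set auto_encrypt { allow_tls = true } for this to work.
auto_encrypt {
  tls = true
}

# ACLs enforced, with token persistence.
acl {
  enabled                  = true
  default_policy           = "deny"
  enable_token_persistence = true
  tokens {
    agent = "<agent-token>"
  }
}
```

The auto_encrypt block is the part involved in the error above: the client asks the servers in its own datacenter for a certificate, so the servers there need a usable CA root.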

Here are the exact consul.hcl configuration files I am using on the various nodes:

Hi!

The behavior you are describing sounds like a bug, fixed in Consul v1.6.2, that didn't allow auto_encrypt to enable HTTPS on client nodes. You can read about the fix as well as the v1.6.2 changelog here: