Servers are able to join an existing cluster without TLS or an ACL write policy

Afternoon,

I suspect I'm missing a config option somewhere, but I would not expect Consul agents to be able to join a cluster that has gossip encryption and mTLS enabled and ACLs enforced unless they have all of the necessary configuration.

Currently, the servers have the following (relevant) configuration:

acl {
        enabled = true
        default_policy = "deny"
        down_policy = "extend-cache"
        enable_token_persistence = true

        tokens {
        }
}
encrypt = "<consul keygen>"
encrypt_verify_incoming = true
encrypt_verify_outgoing = true
verify_incoming = true
verify_outgoing = true
verify_server_hostname = true

ca_file = "/etc/consul.d/tls/consul-agent-ca.pem"
cert_file = "/etc/consul.d/tls/server-consul-0.pem"
key_file = "/etc/consul.d/tls/server-consul-0-key.pem"
auto_encrypt {
        allow_tls = true
}

If I spin up a new server with the gossip encryption key set correctly but with the TLS and ACL configuration completely omitted, the server is able to join the cluster and can read the whole configuration without an ACL token. It is also able to register itself in the consul service, even though the only ACL policies that grant write access are for explicitly named hosts.

[root@LINUX-T-APP01 tmp]$ ./consul join consul-p1-srv01.exe.nhs.uk
Successfully joined cluster by contacting 1 nodes.
[root@LINUX-T-APP01 tmp]$ ./consul members
Node             Address               Status  Type    Build  Protocol  DC   Segment
CONSUL-P1-SRV01  192.168.102.30:8301   alive   server  1.8.5  2         rde  <all>
CONSUL-P2-SRV01  192.168.102.31:8301   alive   server  1.8.5  2         rde  <all>
LINUX-T-APP01    192.168.100.80:8301   alive   server  1.8.4  2         rde  <all>
NOMAD-P1-SRV01   192.168.102.241:8301  alive   client  1.8.5  2         rde  <default>
NOMAD-P2-SRV01   192.168.102.242:8301  alive   client  1.8.5  2         rde  <default>
NOMAD-PY-SRV01   192.168.102.35:8301   alive   client  1.8.5  2         rde  <default>

Dumb question, but did you bootstrap the ACL system?

Given my description, I’d say it’s a reasonable question, but yes, I did bootstrap it.

Maybe it’s just the way it is? When other agents join with only the gossip key, they throw a lot of errors, but those are the errors I would expect to stop them from being able to register and list members.

2020-12-03T09:55:01.440Z [ERROR] agent.client: RPC failed to server: method=Catalog.NodeServiceList server=192.168.102.31:8300 error="rpc error making call: EOF"
2020-12-03T09:55:01.440Z [ERROR] agent.anti_entropy: failed to sync remote state: error="rpc error making call: EOF"
2020-12-03T09:55:18.120Z [ERROR] agent.client: RPC failed to server: method=Coordinate.Update server=192.168.102.30:8300 error="rpc error making call: EOF"
2020-12-03T09:55:18.120Z [ERROR] agent: Coordinate update error: error="rpc error making call: EOF"
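
For comparison, this is roughly what I would expect a joining client agent to need before its RPC calls stop failing. As far as I understand, `consul join` and `consul members` work over the gossip layer on port 8301, which is what the `encrypt` key protects, whereas TLS and ACLs apply to RPC on port 8300, which is where the EOF errors above appear. This is only a minimal sketch and an assumption on my part: the CA path is reused from the server configuration above, and the agent token is a placeholder.

```hcl
# Hypothetical client agent configuration (a sketch, not a tested setup)

# Same gossip key as the servers; this alone is enough to join the gossip pool
encrypt = "<consul keygen>"

# TLS for outgoing RPC to the servers (port 8300)
verify_outgoing        = true
verify_server_hostname = true
ca_file                = "/etc/consul.d/tls/consul-agent-ca.pem"

# Ask the servers for a client certificate; they already set allow_tls = true
auto_encrypt {
        tls = true
}

acl {
        enabled        = true
        default_policy = "deny"

        tokens {
                # Placeholder: a token whose policy grants write on this node's name
                agent = "<agent token>"
        }
}
```

With only the `encrypt` line present, I would expect exactly the behaviour shown above: the node appears in `consul members`, but its RPC calls fail.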