hey all,
so, I’m trying to get a production-ready consul config set up, but it seems like consul has added a lot of functionality since the last time I looked at it, and the docs are not really clear on what a production-ready config should look like. (same for nomad and vault these days, really)
does Hashi have example production-ready configs that include acl policies and such? because following the docs as is doesn’t seem to work, and I’m seeing some weird things. for instance, here’s my agent token policy:
acl = "read"
agent_prefix "" {
  policy = "read"
}
node_prefix "" {
  policy = "read"
}
service_prefix "" {
  policy = "read"
}
session_prefix "" {
  policy = "read"
}
node "{{ grains.host }}" {
  policy = "write"
}
agent "{{ grains.host }}" {
  policy = "write"
}
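for reference, the last two rules are templated per host. assuming {{ grains.host }} renders to node-0 (that's the hostname in the logs further down; the exact rendered value is my assumption here), they should come out roughly like this on that machine:

```hcl
# exact-match rules as rendered for one host
# (assumes grains.host == "node-0")
node "node-0" {
  policy = "write"
}
agent "node-0" {
  policy = "write"
}
```

so each agent's token can only write its own node and agent entries, and everything else is read-only via the prefix rules.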
the datacenter deployment guide shows much more open permissions, and the access control setup tutorial shows even fewer permissions, but says that for production usage you should have “exact-match node rules”, so that’s what I’m attempting to do here.
however, when I use this policy attached to my agent token, I get the errors below in my logs, and things like service registration using CONSUL_HTTP_TOKEN=initial-management-token consul services register something.hcl don’t work, even though the command returns without error.
Feb 03 00:57:47 node-0 consul[3154926]: 2023-02-03T00:57:47.764Z [ERROR] agent.http: Request error: method=GET url=/v1/acl/policy/name/agent from=127.0.0.1:56756 error="ACL not found"
Feb 03 00:57:47 node-0 consul[3154926]: agent.http: Request error: method=GET url=/v1/acl/policy/name/agent from=127.0.0.1:56756 error="ACL not found"
Feb 03 00:57:48 node-0 consul[3154926]: 2023-02-03T00:57:48.116Z [ERROR] agent.http: Request error: method=GET url=/v1/acl/policy/name/readonly from=127.0.0.1:56774 error="ACL not found"
Feb 03 00:57:48 node-0 consul[3154926]: agent.http: Request error: method=GET url=/v1/acl/policy/name/readonly from=127.0.0.1:56774 error="ACL not found"
Feb 03 00:57:48 node-0 consul[3154926]: 2023-02-03T00:57:48.813Z [ERROR] agent.http: Request error: method=GET url=/v1/acl/policy/name/nomad from=127.0.0.1:56794 error="ACL not found"
Feb 03 00:57:48 node-0 consul[3154926]: agent.http: Request error: method=GET url=/v1/acl/policy/name/nomad from=127.0.0.1:56794 error="ACL not found"
Feb 03 00:57:49 node-0 consul[3154926]: 2023-02-03T00:57:49.707Z [ERROR] agent.anti_entropy: failed to sync remote state: error="ACL not found"
Feb 03 00:57:49 node-0 consul[3154926]: agent.anti_entropy: failed to sync remote state: error="ACL not found"
so my question is: what does a production agent ACL policy look like, and where is it documented? and is there similar documentation for basically all of a standard consul + nomad cluster?
I’ve had good experiences with other hashi stuff for a long time (terraform, packer, and even nomad in single-host configs without consul), but consul just seems incredibly opaque and incorrectly documented, so I’m hoping someone out there just has a working production-ready example which illuminates all the issues I’ve been having.
thanks!