Consul exclusive ACL policy for nodes/service in ASG


The documentation recommends that every node have its own exclusive ACL policy for the node/service to use. This ensures that a node/service can only manage itself and has no access to other nodes/services.

The complication I see is that if I run a cluster or service under automation (e.g. an ASG), I need some way to:

  1. Create a node-dedicated ACL policy
  2. Generate a token based on that policy

The documentation is oriented more toward manual setup.

My question: are there any practices/recommendations/setups for automating Consul ACL policy creation? Or is there maybe something on the roadmap?

My current thoughts are as follows (assuming the service is in AWS and that Vault is involved as well):

  1. Have a service that holds a Consul ACL policy with “acl = write” and can create/revoke policies based on some identifier(s) as input (e.g. a service name like “consul” and a node name like “Consul-Server-10-0-0-0”). The service also has an IAM role that allows it to authenticate to Vault and obtain a token. Once it has the token, it calls the Create Policy API.

  2. When a new instance is spun up, it calls the service with a payload to create a policy. Once the policy is created, the instance can use IAM + Vault to obtain a Consul agent token.
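
To make step 1 more concrete, here is a minimal Python sketch of the request such a helper service could send to Consul's Create Policy endpoint (`PUT /v1/acl/policy`). The agent address, the `node_policy_name` naming convention, and the function names are assumptions for illustration only; the request is built but not sent.

```python
import json
import urllib.request

# Assumption: a Consul agent reachable on the default local HTTP port.
CONSUL_ADDR = "http://127.0.0.1:8500"


def node_policy_name(service: str, node: str) -> str:
    """Hypothetical naming convention: one policy per (service, node) pair."""
    return f"{service}-{node}".lower()


def build_create_policy_request(service: str, node: str, acl_token: str) -> urllib.request.Request:
    """Build (but do not send) a PUT /v1/acl/policy request for one node."""
    # Exact-match rule: write access to this node name only.
    rules = f'node "{node}" {{\n  policy = "write"\n}}\n'
    body = {
        "Name": node_policy_name(service, node),
        "Description": f"Dedicated policy for node {node}",
        "Rules": rules,
    }
    return urllib.request.Request(
        f"{CONSUL_ADDR}/v1/acl/policy",
        data=json.dumps(body).encode("utf-8"),
        method="PUT",
        headers={"X-Consul-Token": acl_token, "Content-Type": "application/json"},
    )


req = build_create_policy_request("consul", "Consul-Server-10-0-0-0", "<token with acl = write>")
print(req.get_method(), req.full_url)
print(json.loads(req.data)["Rules"])
```

Sending it would be a matter of passing `req` to `urllib.request.urlopen`. Revoking on instance termination would be a similar call against the policy's ID.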

But before going down this road, I would love to hear your thoughts.

Thank you

First, having a policy per node that grants write privileges for that node's name is the ideal. This is the least-privilege setup you can have. The not-so-nice part is that it requires setting up tons of policies.

In the future there could be better ways for Consul to help, such as policy templating, so that it can fill in the node name for you and let you share a single template across all your Consul nodes, or other concepts like a Node Identity, similar to how we have Service Identities today. Neither of these things is on the roadmap.

If you do not require strict least-privilege level tokens for your Consul agents then you may be better off with a prefix matching rule for node names and using a common policy.

node_prefix "consul-node-" {
   policy = "write"
}

Using that policy will allow tokens generated with that policy to perform write operations on any node whose name starts with “consul-node-”. For many users this level of restriction is enough, but you should carefully consider whether it meets your specific security requirements.
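
To illustrate the trade-off, here is a small Python sketch that renders both rule forms and shows which node names each one would cover. The helper names and the sample node list are made up for illustration, and the matching logic is a simplified model of exact versus prefix matching, not Consul's actual implementation.

```python
from typing import List


def render_node_rule(name: str, exact: bool) -> str:
    """Render an exact-match `node` rule or a `node_prefix` rule granting write."""
    keyword = "node" if exact else "node_prefix"
    return f'{keyword} "{name}" {{\n   policy = "write"\n}}'


def writable_nodes(rule_name: str, exact: bool, nodes: List[str]) -> List[str]:
    """Simplified model: which of `nodes` the rule would grant write access to."""
    if exact:
        return [n for n in nodes if n == rule_name]
    return [n for n in nodes if n.startswith(rule_name)]


# Hypothetical node names.
nodes = ["consul-node-1", "consul-node-2", "web-node-1"]

print(render_node_rule("consul-node-", exact=False))
print(writable_nodes("consul-node-", False, nodes))   # shared prefix policy
print(writable_nodes("consul-node-1", True, nodes))   # per-node policy
```

With the shared prefix policy, a token issued to one agent can also write to every other node under that prefix. The per-node policy confines it to its own name.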

Your process certainly could work. But it would probably involve automatically creating more Vault roles as each role gets tied to a specific set of policies. So after your step 2 you would have to create a new Vault role that uses the policy that was just created. Whether this is a good idea again depends on your security requirements.
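
As a rough sketch of that extra step, the Python below builds the request that would create a per-node role in Vault's Consul secrets engine, plus the URL the instance would later read to get its token. It assumes the engine is mounted at `consul/` and that the role parameter is named `policies` (newer Vault releases call it `consul_policies`, so check the version you run). The function names and addresses are hypothetical, and nothing is sent over the network.

```python
import json
import urllib.request

# Assumptions: local Vault address and the default "consul" mount path.
VAULT_ADDR = "http://127.0.0.1:8200"
CONSUL_MOUNT = "consul"


def build_create_role_request(role_name: str, consul_policy: str, vault_token: str) -> urllib.request.Request:
    """Build (but do not send) a request tying one Vault role to one Consul policy."""
    # Assumption: the parameter is "policies"; newer Vault versions use "consul_policies".
    body = {"policies": [consul_policy]}
    return urllib.request.Request(
        f"{VAULT_ADDR}/v1/{CONSUL_MOUNT}/roles/{role_name}",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={"X-Vault-Token": vault_token, "Content-Type": "application/json"},
    )


def creds_url(role_name: str) -> str:
    """URL an instance would GET to obtain a Consul token for its role."""
    return f"{VAULT_ADDR}/v1/{CONSUL_MOUNT}/creds/{role_name}"


# Hypothetical per-node role and policy names.
req = build_create_role_request("consul-server-10-0-0-0", "consul-consul-server-10-0-0-0", "<vault token>")
print(req.get_method(), req.full_url)
print(creds_url("consul-server-10-0-0-0"))
```

Each instance would also need a Vault policy that allows reading only its own `creds/<role>` path, otherwise the per-node Consul policy buys little.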

Today, it would seem the only way forward is probably to automate some of this yourself. I would recommend opening a feature request on GitHub for Consul, and maybe Vault as well, describing your use case. On the Consul side, the need for templating, identities, or some other way to avoid requiring many policies for least-privilege tokens has been on my mind since implementing the new ACLs in 1.4. Vault could, however, make this process much easier by being able to create Consul policies itself for roles and then generating the tokens from a policy it manages. So instead of having to predefine all of them in Consul, Vault could do it for you. That is definitely not how the secret engine works today, but it seems like it would be a reasonable request.

One other note is that the Terraform Consul provider can generate policies and tokens. You might be able to use Terraform to do the policy creation and Vault management instead of having to automate something yourself (I haven’t experimented with the Vault provider for Terraform, so I can’t say for sure it would do what you want, but it might be worth a look). I am envisioning something where you have Terraform create the policy in Consul, then the role in Vault, instead of needing to write your own code to do this.


Thank you @mkeeler for your input. Now I am starting to put all the pieces together in my head. Indeed, if you go the “least-privilege” route, everything from top to bottom has to be least-privileged, otherwise it does not make sense. So yes, in my scenario I have to have an instance-dedicated role in Vault that has a policy to generate a token against a dedicated Consul ACL policy. Otherwise it is easier to go with prefixes.

As for Vault, from what I see in the documentation it can generate a Vault role and a policy in Consul, but there is also a section that says “For Consul versions 1.4 and above, generate a policy in Consul”. I am not sure whether that is a recommendation because creating Consul policies from Vault will be deprecated, or whether it is just a feature. Do you know the best place to ask?

Terraform providers would work if Consul policies supported templates; that would be a nice Infrastructure as Code and Security as Code solution. But in automation it does not help and becomes the same as any manual process.

I will open a feature request on the Consul GitHub.

Vault supports both pre-1.4 ACLs (rules assigned directly to tokens) as well as the new ACL system introduced in 1.4 simultaneously. When Vault is using the legacy Consul ACL system, it embeds the rules specified directly within the tokens it generates. When Vault is using the new ACL system it requires a list of policies to give to the token as that is the only way a token can be given privileges using the new APIs. It might be possible to get Vault to use the legacy API even when using a newer Consul version (as the APIs have not been removed). However in your case it wouldn’t buy you much.

In addition to changing where the rules are defined, the new ACL system introduces exact-matching rules. Previously, all rules were prefix matching. So even if you could get Vault to use the legacy API and attach rules to tokens, those rules would be no better than the policy that uses a node_prefix rule:

node_prefix "consul-node-" {
   policy = "write"
}

Feature request opened here