Hi,
I’m attempting to deploy Ingress Gateways through consul-helm in a secondary datacenter. The service-init container fails to complete due to missing permissions:
Address "10.1.6.66" written to /tmp/address.txt successfully
Error registering service "ingress-gateway": Unexpected response code: 403 (Permission denied: Missing service:write on ingress-gateway)
I’ve confirmed that replication is working on the secondary DC servers. The ACL token (stored in the secret consul-ingress-gateway-ingress-gateway-acl-token) is configured with the appropriate policy. It was created by the server-acl-init job as part of the Helm deployment, and I can see that the policy is synced from the primary DC as well.
$ consul acl policy read -name ingress-gateway-ingress-gateway-token
ID: 1b245ea1-e799-612c-da5d-cdfffe6f4b50
Name: ingress-gateway-ingress-gateway-token
Description: ingress-gateway-ingress-gateway-token Token Policy
Datacenters:
Rules:
service "ingress-gateway" {
policy = "write"
}
node_prefix "" {
policy = "read"
}
service_prefix "" {
policy = "read"
}
$ consul acl token read -id 713fa743-2e91-12c4-3d7f-648420542872
AccessorID: 713fa743-2e91-12c4-3d7f-648420542872
SecretID: <redacted, confirmed to match the k8s secret>
Description: ingress-gateway-ingress-gateway-token Token
Local: true
Create Time: 2021-05-28 15:55:30.376686108 +0000 UTC
Policies:
1b245ea1-e799-612c-da5d-cdfffe6f4b50 - ingress-gateway-ingress-gateway-token
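For completeness, a quick way to compare the SecretID with what’s stored in the Kubernetes secret (the data key is assumed to be token; adjust the secret name/namespace for your release):

# Decode the token from the Kubernetes secret and compare it to the SecretID above.
kubectl get secret consul-ingress-gateway-ingress-gateway-acl-token \
  -o 'jsonpath={.data.token}' | base64 -d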
Any idea what’s going on here?
PS: When the original policy was created by consul-k8s on the primary DC, only that single DC was listed under the policy’s “valid datacenters”. I manually patched it to be valid in all DCs so it would be picked up by the secondary DC:
# Read the existing policy once and pull out the fields we need.
_policy_json=$(sudo -E -u consul consul acl policy read -name ingress-gateway-ingress-gateway-token -format=json)
_policy_id=$(jq -r '.ID' <<< "${_policy_json}")
_policy_rules=$(jq -r '.Rules' <<< "${_policy_json}")
_policy_description=$(jq -r '.Description' <<< "${_policy_json}")

# Build the update payload; an empty Datacenters list makes the policy valid in all datacenters.
jq -n --arg id "${_policy_id}" \
   --arg name "ingress-gateway-ingress-gateway-token" \
   --arg desc "${_policy_description}" \
   --arg rules "${_policy_rules}" \
   '{ID:$id, Name:$name, Description:$desc, Rules:$rules, Datacenters:[]}' > policy-update.json

# PUT the updated policy back through the HTTP API.
sudo -u consul curl -X PUT --data @policy-update.json \
  --header "X-Consul-Token: ${CONSUL_HTTP_TOKEN}" \
  --cacert "${CONSUL_CACERT}" \
  --cert "${CONSUL_CLIENT_CERT}" \
  --key "${CONSUL_CLIENT_KEY}" \
  "${CONSUL_HTTP_ADDR}/v1/acl/policy/${_policy_id}"
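For what it’s worth, the same update can probably be done with the CLI rather than raw curl. This is only a sketch and assumes -no-merge replaces the policy with exactly the fields given (so omitting -valid-datacenter leaves the Datacenters list empty, i.e. valid everywhere); worth verifying against your Consul version:

# Sketch of the equivalent update via `consul acl policy update`.
sudo -E -u consul consul acl policy update \
  -id "${_policy_id}" \
  -name "ingress-gateway-ingress-gateway-token" \
  -description "${_policy_description}" \
  -rules "${_policy_rules}" \
  -no-merge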
Hey @lkysow, here are the Helm values for each. Both ingress gateways are using the default names derived from the consul-helm chart. Could that potentially be the cause of this issue (name conflicts on the Consul side)? I ended up creating an identical policy and assigning it to the DC2 local token. That seems to have solved the problem.
The original problem seems to be related to the policy not getting applied to the local token in the secondary DC even after manually updating said policy to be valid in all datacenters (see the note in my original post above).
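Roughly what the workaround looked like on the DC2 side, for reference; the -dc2 policy name and the rules file path are just placeholders, and the accessor ID is the local token shown above:

# Create an identical policy directly in the secondary DC (the name here is illustrative);
# the rules file contains the same rules shown earlier.
consul acl policy create \
  -name ingress-gateway-ingress-gateway-token-dc2 \
  -rules @ingress-gateway-policy.hcl

# Attach it to the gateway's existing local token (accessor ID from the output above).
consul acl token update \
  -id 713fa743-2e91-12c4-3d7f-648420542872 \
  -policy-name ingress-gateway-ingress-gateway-token-dc2 \
  -merge-policies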
Yeah, that’s the problem. Kubernetes will create a local ACL token for the ingress gateway, and if it has the same name as a previous gateway it will think the token has already been created.
So it looks like we require separate names for each gateway across DCs.
I’ll create a ticket to document this and another ticket to error out during ACL initialization.
Well, actually, if you’re using federation through Helm then this works: we suffix the policies with the datacenter names. But it looks like you’re doing federation outside of Helm?
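For anyone federating outside of Helm, one way around the collision is to give each datacenter’s gateway a distinct name. A sketch below, where the release name, values file, and the ingressGateways key layout are assumptions to verify against your chart version:

# Give the secondary DC's gateway its own name so the generated ACL policy and
# token names don't collide with the primary DC's gateway.
helm upgrade consul hashicorp/consul \
  -f values-dc2.yaml \
  --set ingressGateways.enabled=true \
  --set 'ingressGateways.gateways[0].name=ingress-gateway-dc2'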