It doesn't happen every time, but it does happen sometimes: the external-dns pod from the Helm chart comes up with only the node's IAM role instead of the pod identity role it should be using.
Here is the Helm chart:
resource "helm_release" "external_dns" {
max_history = 10
namespace = kubernetes_namespace.external_dns_namespace.metadata.0.name
create_namespace = false
name = "external-dns"
repository = "https://kubernetes-sigs.github.io/external-dns"
chart = "external-dns"
version = var.versions.external_dns
values = []
set {
name = "policy"
value = "sync"
}
set {
name = "txtOwnerId"
value = "${var.cluster_name}-${data.aws_route53_zone.cluster_hosted_zone.zone_id}"
}
set {
name = "imagePullSecrets[0].name"
value = kubernetes_secret.external_dns_image_pull.metadata.0.name
}
set_list {
name = "domainFilters"
value = [
data.aws_route53_zone.cluster_hosted_zone.name
]
}
set {
name = "serviceAccount.create"
value = false
}
set {
name = "serviceAccount.name"
value = kubernetes_service_account.external_dns_service_account.metadata.0.name
}
depends_on = [aws_eks_pod_identity_association.eks_pod_identity]
}
The namespace is directly referenced.
The service account is directly referenced.
The aws_eks_pod_identity_association is in a depends_on.
So everything should be in place before this Helm chart is deployed, which should mean the pod the chart creates uses the pod identity role.
Yet sometimes it comes up with the node IAM role. Simply deleting the pod and letting a new one come up fixes it, but obviously that isn't a solution.
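The only band-aid I can think of is wedging an artificial delay between the associations and the chart, something like the sketch below (time_sleep comes from the hashicorp/time provider, and the 30s figure is just a guess), but I'd rather understand the root cause than paper over it:
resource "time_sleep" "wait_for_pod_identity" {
  # Crude buffer to give the pod identity associations time to settle
  # before Helm installs the chart.
  create_duration = "30s"

  depends_on = [aws_eks_pod_identity_association.eks_pod_identity]
}

# helm_release.external_dns would then switch to:
#   depends_on = [time_sleep.wait_for_pod_identity]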
Some additional code:
resource "kubernetes_service_account" "external_dns_service_account" {
metadata {
name = "external-dns"
namespace = kubernetes_namespace.external_dns_namespace.metadata.0.name
labels = {} # put this here otherwise plans will show perpetual diff
}
depends_on = [aws_eks_pod_identity_association.eks_pod_identity]
}
resource "kubernetes_namespace" "external_dns_namespace" {
metadata {
labels = {
name = "external-dns"
}
name = "external-dns"
}
}
This is the only place that is even a little questionable, because the depends_on points at something that has a for_each in it. But it correctly references the top-level resource, not one of the instances, so it should wait for all of them to be created before letting the Helm release go ahead (see the note after the code below).
locals {
  namespace_role_products = [
    {
      namespace       = kubernetes_namespace.external_dns_namespace.metadata.0.name,
      role_arn        = var.aws_iam_role_arn_external_dns,
      service_account = "external-dns"
    },
    {
      namespace       = kubernetes_namespace.balancer.metadata.0.name,
      role_arn        = var.aws_iam_role_arn_balancer,
      service_account = "balancer"
    },
    {
      namespace       = kubernetes_namespace.cert_manager.metadata.0.name,
      role_arn        = var.aws_iam_role_arn_cert_manager,
      service_account = "cert-manager"
    }
  ]
}

resource "aws_eks_pod_identity_association" "eks_pod_identity" {
  for_each = {
    for product in local.namespace_role_products : "${product.namespace}:${product.service_account}" => product
  }

  cluster_name    = var.cluster_name
  namespace       = each.value.namespace
  role_arn        = each.value.role_arn
  service_account = each.value.service_account
}
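For what it's worth, the for_each produces one association per "namespace:service_account" key (e.g. aws_eks_pod_identity_association.eks_pod_identity["external-dns:external-dns"] for the external-dns one), and the bare resource address in depends_on covers every instance. A quick sanity-check output along these lines (the output name is mine) shows what actually gets created:
output "pod_identity_association_ids" {
  # Maps each "namespace:service_account" key to the association ID
  # reported by the AWS provider.
  value = {
    for key, assoc in aws_eks_pod_identity_association.eks_pod_identity :
    key => assoc.association_id
  }
}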
So, any thoughts on why the external-dns pod can come up before the pod identity association has actually taken effect?