Cycle Error in Terraform When Using Subnets, NAT Gateways, NACLs, and ECS Service

I’m facing a cycle error in my Terraform configuration when deploying an AWS VPC with public/private subnets, NAT gateways, NACLs, and an ECS service. Here’s the error message:

Error: Cycle: module.app.aws_route_table_association.private_route_table_association[1] (destroy), module.app.aws_network_acl_rule.private_inbound[7] (destroy), module.app.aws_network_acl_rule.private_outbound[3] (destroy), module.app.aws_network_acl_rule.public_inbound[8] (destroy), module.app.aws_network_acl_rule.public_outbound[2] (destroy), module.app.aws_network_acl_rule.private_inbound[6] (destroy), module.app.local.public_subnets (expand), module.app.aws_nat_gateway.nat_gateway[0], module.app.local.nat_gateways (expand), module.app.aws_route.private_nat_gateway_route[0], module.app.aws_nat_gateway.nat_gateway[1] (destroy), module.app.aws_network_acl_rule.public_inbound[7] (destroy), module.app.aws_network_acl_rule.private_inbound[8] (destroy), module.app.aws_subnet.public_subnet[0], module.app.aws_route_table_association.public_route_table_association[1] (destroy), module.app.aws_subnet.public_subnet[0] (destroy), module.app.local.private_subnets (expand), module.app.aws_ecs_service.service, module.app.aws_network_acl_rule.public_inbound[6] (destroy), module.app.aws_subnet.private_subnet[0] (destroy), module.app.aws_subnet.private_subnet[0]

I have private and public subnets with associated route tables, NAT gateways, and network ACLs, and I’m deploying an ECS service in the private subnets. Below is the full Terraform configuration that’s relevant to the cycle issue:

resource "aws_subnet" "public_subnet" {
  count  = length(var.availability_zones)
  vpc_id = local.vpc_id
  cidr_block              = local.public_subnets_by_az[var.availability_zones[count.index]][0]
  availability_zone       = var.availability_zones[count.index]
  map_public_ip_on_launch = true
}

resource "aws_subnet" "private_subnet" {
  count  = length(var.availability_zones)
  vpc_id = local.vpc_id
  cidr_block              = local.private_subnets_by_az[var.availability_zones[count.index]][0]
  availability_zone       = var.availability_zones[count.index]
  map_public_ip_on_launch = false
}

resource "aws_internet_gateway" "public_internet_gateway" {
  vpc_id = local.vpc_id
}

resource "aws_route_table" "public_route_table" {
  count  = length(var.availability_zones)
  vpc_id = local.vpc_id
}

resource "aws_route" "public_internet_gateway_route" {
  count                  = length(aws_route_table.public_route_table)
  route_table_id         = element(aws_route_table.public_route_table[*].id, count.index)
  gateway_id             = aws_internet_gateway.public_internet_gateway.id
  destination_cidr_block = local.internet_cidr
}

resource "aws_route_table_association" "public_route_table_association" {
  count          = length(aws_subnet.public_subnet)
  route_table_id = element(aws_route_table.public_route_table[*].id, count.index)
  subnet_id      = element(local.public_subnets, count.index)
}

resource "aws_eip" "nat_eip" {
  count  = length(var.availability_zones)
  domain = "vpc"
}

resource "aws_nat_gateway" "nat_gateway" {
  count         = length(var.availability_zones)
  allocation_id = element(local.nat_eips, count.index)
  subnet_id     = element(local.public_subnets, count.index)
}

resource "aws_route_table" "private_route_table" {
  count  = length(var.availability_zones)
  vpc_id = local.vpc_id
}

resource "aws_route" "private_nat_gateway_route" {
  count                  = length(aws_route_table.private_route_table)
  route_table_id         = element(local.private_route_tables, count.index)
  nat_gateway_id         = element(local.nat_gateways, count.index)
  destination_cidr_block = local.internet_cidr
}

resource "aws_route_table_association" "private_route_table_association" {
  count          = length(aws_subnet.private_subnet)
  route_table_id = element(local.private_route_tables, count.index)
  subnet_id      = element(local.private_subnets, count.index)
  # lifecycle {
  #   create_before_destroy = true
  # }
}

resource "aws_network_acl" "private_subnet_acl" {
  vpc_id     = local.vpc_id
  subnet_ids = local.private_subnets
}

resource "aws_network_acl_rule" "private_inbound" {
  count           = local.private_inbound_number_of_rules
  network_acl_id  = aws_network_acl.private_subnet_acl.id
  egress          = false
  rule_number     = tonumber(local.private_inbound_acl_rules[count.index]["rule_number"])
  rule_action     = local.private_inbound_acl_rules[count.index]["rule_action"]
  from_port       = lookup(local.private_inbound_acl_rules[count.index], "from_port", null)
  to_port         = lookup(local.private_inbound_acl_rules[count.index], "to_port", null)
  icmp_code       = lookup(local.private_inbound_acl_rules[count.index], "icmp_code", null)
  icmp_type       = lookup(local.private_inbound_acl_rules[count.index], "icmp_type", null)
  protocol        = local.private_inbound_acl_rules[count.index]["protocol"]
  cidr_block      = lookup(local.private_inbound_acl_rules[count.index], "cidr_block", null)
  ipv6_cidr_block = lookup(local.private_inbound_acl_rules[count.index], "ipv6_cidr_block", null)
}

resource "aws_network_acl_rule" "private_outbound" {
  count           = var.allow_all_traffic || var.use_only_public_subnet ? 0 : local.private_outbound_number_of_rules
  network_acl_id  = aws_network_acl.private_subnet_acl.id
  egress          = true
  rule_number     = tonumber(local.private_outbound_acl_rules[count.index]["rule_number"])
  rule_action     = local.private_outbound_acl_rules[count.index]["rule_action"]
  from_port       = lookup(local.private_outbound_acl_rules[count.index], "from_port", null)
  to_port         = lookup(local.private_outbound_acl_rules[count.index], "to_port", null)
  icmp_code       = lookup(local.private_outbound_acl_rules[count.index], "icmp_code", null)
  icmp_type       = lookup(local.private_outbound_acl_rules[count.index], "icmp_type", null)
  protocol        = local.private_outbound_acl_rules[count.index]["protocol"]
  cidr_block      = lookup(local.private_outbound_acl_rules[count.index], "cidr_block", null)
  ipv6_cidr_block = lookup(local.private_outbound_acl_rules[count.index], "ipv6_cidr_block", null)
}

resource "aws_ecs_service" "service" {
  name                = "service"
  cluster             = aws_ecs_cluster.ecs.arn
  task_definition     = aws_ecs_task_definition.val_task.arn
  desired_count       = 2
  scheduling_strategy = "REPLICA"

  network_configuration {
    subnets          = local.private_subnets
    assign_public_ip = false
    security_groups  = [aws_security_group.cluster_sg.id]
  }
}

Hi @hanrat,

The configuration you have here isn’t complete enough to determine what’s going on. A number of resources are connected through local values, but those locals aren’t included, so we can’t see how the resources are actually wired together. You also have an aws_nat_gateway.nat_gateway which is consumed through local.nat_gateways, whose definition isn’t shown in the configuration.
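For reference, here is a purely hypothetical sketch of what those locals might look like (none of these definitions are in your post). The thing to notice is that a local which aggregates a whole resource set makes every consumer depend on every instance at once:

# Hypothetical locals (not in the original post) showing one way the
# resources could be wired together through aggregated values:
locals {
  public_subnets       = aws_subnet.public_subnet[*].id
  private_subnets      = aws_subnet.private_subnet[*].id
  nat_eips             = aws_eip.nat_eip[*].id
  nat_gateways         = aws_nat_gateway.nat_gateway[*].id
  private_route_tables = aws_route_table.private_route_table[*].id
}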

I think in this case, however, you are going to have to retrace some of your steps to figure out what has gone wrong. Given the commented-out create_before_destroy option, I’m guessing that you created some resources with a certain dependency order, and have since changed the configuration in such a way that it conflicts with those existing resources’ dependencies.
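If that’s what happened, one detail worth knowing is that create_before_destroy is recorded in the state for instances that were created with it, so commenting the block out doesn’t remove the old replacement ordering from resources that already exist. A minimal sketch, assuming the association was originally applied with the option enabled, would be to restore it:

resource "aws_route_table_association" "private_route_table_association" {
  count          = length(aws_subnet.private_subnet)
  route_table_id = element(local.private_route_tables, count.index)
  subnet_id      = element(local.private_subnets, count.index)

  # Restoring this keeps the ordering consistent with what is already
  # recorded in state for the existing instances.
  lifecycle {
    create_before_destroy = true
  }
}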

My generic advice would be to revert the configuration to the point where you can apply it cleanly in its entirety, and then make smaller incremental changes to narrow down how you created the dependency cycle. Trying to refactor a configuration while simultaneously planning many changes can make it very difficult to understand how data flows through that combined set of operations.
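When you get stuck, rendering the dependency graph can also help locate the cycle. For example, assuming Graphviz is installed for the dot step:

# Draw the plan’s dependency graph with any cycles highlighted:
terraform graph -draw-cycles | dot -Tsvg > graph.svg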