Security group rules destroyed instead of added

Hi 🙂 I’d love to get some help with creating security group rules dynamically. I’ve simplified most of the logic so you can reproduce it as well.

I have 3 files:

  1. main.tf
resource "aws_security_group_rule" "bp_service_sg_rule" {                                                                                                     
  for_each          = { for index, rules in local.security_group_rules : index => rules }
  type              = "ingress"
  from_port         = each.value.open_port
  to_port           = each.value.open_port
  protocol          = "tcp"
  security_group_id = "sg-0e5e36abc940a1a2a"
  cidr_blocks       = [each.value.to_group]
}
  2. variables.tf
locals {                                                                                                                                                      
  security_groups_file = yamldecode(file("${path.module}/security_groups.yml"))
  security_group_rules = flatten([
    for service_name, rules in local.security_groups_file.groups : [
      for rule in rules : {
        service_name = service_name
        open_port    = rule.open_port
        to_group     = rule.to_group
      }
    ]
  ])
}
  3. security_groups.yml
groups:                                                                                                                                                       
  test:
  - open_port: 22
    to_group: 4.4.4.4/32

The simplified logic is:

I go over security_groups.yml file with a loop to add the rules I want for the specific security_group. Example:

  test:      <===== The name of the security group                                                                                                                          
  - open_port: 22   <===== The port number that I want to open   
    to_group: 4.4.4.4/32    <===== The ip address I want to open it to.
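For reference, with the single-rule YAML above, the flatten expression in variables.tf evaluates to a list like this (a sketch of what terraform console would show; exact formatting may differ):

```hcl
# local.security_group_rules with the single-rule YAML above
[
  {
    service_name = "test"
    open_port    = 22
    to_group     = "4.4.4.4/32"
  },
]
```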

If I want to add another rule to the list, like this:

  test:
  - open_port: 22             <====== The new
    to_group: 3.3.3.3/32    <======    rule
  - open_port: 22                                                                                                                                             
    to_group: 4.4.4.4/32

Instead of adding the rule, Terraform destroys it and recreates it, like this:

  # aws_security_group_rule.bp_service_sg_rule["0"] must be replaced
-/+ resource "aws_security_group_rule" "bp_service_sg_rule" {
      ~ cidr_blocks              = [ # forces replacement
          - "4.4.4.4/32",
          + "3.3.3.3/32",
        ]
        from_port                = 22
      ~ id                       = "sgrule-3857769040" -> (known after apply)
      - ipv6_cidr_blocks         = [] -> null
      - prefix_list_ids          = [] -> null
        protocol                 = "tcp"
        security_group_id        = "sg-0e5e36abc940a1a2a"
        self                     = false
      + source_security_group_id = (known after apply)
        to_port                  = 22
        type                     = "ingress"
    }

  # aws_security_group_rule.bp_service_sg_rule["1"] will be created
  + resource "aws_security_group_rule" "bp_service_sg_rule" {
      + cidr_blocks              = [
          + "4.4.4.4/32",
        ]
      + from_port                = 22
      + id                       = (known after apply)
      + protocol                 = "tcp"
      + security_group_id        = "sg-0e5e36abc940a1a2a"
      + self                     = false
      + source_security_group_id = (known after apply)
      + to_port                  = 22
      + type                     = "ingress"
    }

The problem is that we keep one file with ALL of the security groups for our cluster; when we make one simple change to a specific security group, Terraform destroys ALL of the rules and recreates them, and that’s a NO GO since it shuts us down until the apply finishes.

Hi @denzhel,

Terraform tracks the individual instances of aws_security_group_rule.bp_service_sg_rule by the map keys you selected in your for_each expression. Because you used the index from the list as the key, Terraform believes that the position of the items in the list is the unique identifier, and when you insert a new item in the middle of the list you implicitly change the position of all of the items after it.
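To sketch what that means with your example (keys are the stringified list indices; values abbreviated), the map your current for_each expression builds shifts when a rule is inserted at the front of the list:

```hcl
# Before adding the new rule: one element, keyed by its list position
{
  "0" = { open_port = 22, to_group = "4.4.4.4/32" }
}

# After inserting 3.3.3.3/32 ahead of it, the existing rule's key changes
# from "0" to "1", so Terraform sees instance "0" as a different rule:
{
  "0" = { open_port = 22, to_group = "3.3.3.3/32" }
  "1" = { open_port = 22, to_group = "4.4.4.4/32" }
}
```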

The answer here is to select a different key that represents what you consider to be the unique identifier for a rule. In this case it seems like a combination of open_port and to_group would make a sufficient unique identifier, in which case we can adjust your for_each expression to produce such a key like this:

  for_each = { for rules in local.security_group_rules : "${rules.to_group} ${rules.open_port}" => rules }
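Put in context, the full resource block would look like this (the same attributes as your original; only the for_each key expression changes):

```hcl
resource "aws_security_group_rule" "bp_service_sg_rule" {
  # Key each instance by "<cidr> <port>" rather than by list position,
  # so inserting or removing a rule doesn't shift the other keys.
  for_each          = { for rules in local.security_group_rules : "${rules.to_group} ${rules.open_port}" => rules }
  type              = "ingress"
  from_port         = each.value.open_port
  to_port           = each.value.open_port
  protocol          = "tcp"
  security_group_id = "sg-0e5e36abc940a1a2a"
  cidr_blocks       = [each.value.to_group]
}
```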

With the example settings you showed, the above would cause Terraform to track your initial single rule with the following address:

aws_security_group_rule.bp_service_sg_rule["4.4.4.4/32 22"]

Then when you add the new rule you indicated, Terraform will see that it should have the following address:

aws_security_group_rule.bp_service_sg_rule["3.3.3.3/32 22"]

…and, importantly, Terraform can see that the element for "4.4.4.4/32 22" is still present in the map and unchanged, so it knows that no action is required on that existing instance. It should then just plan to create the new instance, corresponding to the new element in your map:

  # aws_security_group_rule.bp_service_sg_rule["3.3.3.3/32 22"] will be created
  + resource "aws_security_group_rule" "bp_service_sg_rule" {
      ....

Wow!! Thank you, thank you, and thank you 🙂
You helped us a lot!!!

It works.

Hi @apparentlymart

I’m facing a similar problem: I have dynamic aws_security_group_rule resources whose cidr_blocks come from cluster nodes:

resource "aws_security_group_rule" "Galera_cloud_udp" {
  for_each = toset(local.all_nodes)
...
  cidr_blocks       = ["${local.all_cidrs[each.value]}/32"]
}

locals {
  all_nodes = concat(var.aws_nodes, var.gcp_nodes)
  gcp_cidr  = { for nodeName in var.gcp_nodes : nodeName => google_compute_instance.hitachi-dev-google[nodeName].network_interface.0.access_config.0.nat_ip }
  aws_cidrs = { for nodeName in var.aws_nodes : nodeName => aws_instance.hitachi-dev[nodeName].public_ip }
  all_cidrs = tomap(merge(local.aws_cidrs, local.gcp_cidr))
}

variable "aws_nodes" {
  default = ["aws1", "aws2"]
}

When I want to scale up, I’ll change aws_nodes to ["aws1", "aws2", "aws3"]

But Terraform gives me:

 # aws_security_group_rule.Galera_cloud_udp["aws2"] must be replaced
-/+ resource "aws_security_group_rule" "Galera_cloud_udp" {
      ~ cidr_blocks              = [
          - "35.159.53.207/32",
        ] -> (known after apply) # forces replacement

It seems one “known after apply” value in the all_cidrs map makes the entire map “known after apply”…

Can I somehow make aws_security_group_rule update in place rather than destroy/create? I can see other resources behaving as expected.

Many Thanks

Jan

Hi @honzakadlec,

I’m not sure exactly what’s going on from what you shared here. One thing that might be relevant (but I’m not sure) is that Terraform requires that all of the keys in a map be known values, or else the entire map is treated as unknown. That’s because without knowing the keys it’s impossible to know how many elements are present, what their keys all are, etc., and so typical map operations like indexing would become undecidable.

In your example you’ve shown what appear to be literal string values for var.aws_nodes, and so I would expect all of the nodeName values in your for expressions to be known and therefore the results to be known. Is that true for your real configuration (both in var.aws_nodes and var.gcp_nodes), or did you replace those with constant values to make the example simpler to share?

Hi @apparentlymart,

thank you for the quick response. Yes, both var.aws_nodes and var.gcp_nodes are static arrays, so they are known keys in the map at the apply step. They serve as the input source for “scaling”: we just put one more node in and expect Terraform to scale the infrastructure up…

I’ve tried to divide the run to two steps:

terraform apply -target='aws_instance.hitachi-dev["aws3"]'

  • this created the aws3 instance only

and

terraform apply

  • this had all IPs known, so it created the rest of the resources as expected…

So your thought about undecidable operations on a map marked as “unknown” probably explains this… To handle that, Terraform would presumably need to be able to mark individual values in the map as “unknown”, rather than the complete map…

Thank you again

Jan

Terraform maps can have unknown values, just not unknown keys. If all of your keys are known then the map should have a known number of elements and only the individual new values would be potentially unknown.

I think what you are seeing here is a bug that has been recently fixed and will be included in the forthcoming 0.13.0 release, but is still present in the latest 0.12.x release. This issue relates specifically to how the merge function reacts to unknown values.

If I’m right about that cause, it might help to replace your use of the merge function with a more verbose approach that skips that function for now, like the following:

  all_cidrs = zipmap(
    concat(keys(local.aws_cidrs), keys(local.gcp_cidr)),
    concat(values(local.aws_cidrs), values(local.gcp_cidr)),
  )

Here I used zipmap instead of merge in a way that ought to be functionally equivalent but should avoid the bug with unknown variable handling in merge. This shouldn’t be necessary anymore once 0.13.0 is released and you’ve been able to upgrade to it, but hopefully it allows you to make progress in the meantime.
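As a quick sanity check of that equivalence (the IPs here are placeholders, not your real values):

```hcl
# zipmap pairs keys with values positionally, so given
#   aws_cidrs = { aws1 = "1.1.1.1", aws2 = "2.2.2.2" }
#   gcp_cidr  = { gcp1 = "5.5.5.5" }
# both merge(local.aws_cidrs, local.gcp_cidr) and the zipmap
# expression above produce the same map:
{
  aws1 = "1.1.1.1"
  aws2 = "2.2.2.2"
  gcp1 = "5.5.5.5"
}
```

Note that the two differ when the same key appears in both maps (merge lets later maps override earlier ones), but since your node names are distinct across clouds the keys here are disjoint.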

Hi @apparentlymart,

you are correct, your suggestion worked like a charm!

Thank you again for your help

Jan