Using setintersection() and keys() to filter vars and locals gives different results

Hi tf community,

I have a question about how to use setintersection() and keys() on vars. I have used them on locals and they worked as I expected, but the same expression doesn’t produce the same results on vars. How can I use them on vars to achieve what I want?

# local.regions is to serve as the setintersection control set of strings 
# as the keys(s_v) has non-region-related keys
locals                    {
  regions               = ["region-1", "region-2"] 
}

# local.local_obj has the exact same data structure as of var.var_obj
locals                    {
  local_obj             = {
    collection_local    = {
      subnet_local      = {
        other_key       = "some-value"
        region-1        = "cidr-for-region-1-for-local-obj"
        # region-2        = "cidr-for-region-2-for-local-obj"
      }
    }
  }
}

# var.var_obj is a map(map(object())) data structure: the outer map is the collection of subnets, 
# the inner map is the subnets within a collection, and the object is the subnet config
variable "var_obj"        {
  type                  = map(
    map                   (
      object              ({
        other_key       = string
        region-1        = optional(string)
        region-2        = optional(string)
      })
    )
  )  
  default               = {
    collection_var      = {
      subnet_var        = {
        other_key       = "something-other-value"
        region-1        = "cidr-for-region-1-for-var-obj"
        # region-2        = "cidr-for-region-2-for-var-obj"
      }
    }
  }
}

# local.transform-local and local.transform-var are two transforming locals to suit the subnet resource's needs
locals                    {
  transform-local       = flatten([
    for c_k,c_v        in local.local_obj : [
      for s_k, s_v     in c_v : [
        for rg         in setintersection(local.regions, keys(s_v)) : {
          name          = "${rg}-${s_k}"
          region        = rg
          ip_cidr_range = s_v[rg]
        } 
      ]   
    ]
  ])  
  
  transform-var         = flatten([
    for c_k,c_v        in var.var_obj : [
      for s_k, s_v     in c_v : [
        for rg         in setintersection(local.regions, keys(s_v)) : {
          name          = "${rg}-${s_k}"
          region        = rg
          ip_cidr_range = s_v[rg]
        } 
      ]   
    ]
  ])  
}     

# as you can see, transform-local and transform-var do exactly the same thing: they filter a subset of the keys (region-1 or region-2, but not other_key). When a local or var value has only one of the regions, I want the transformation to produce a record only for that existing region.

# output.contrast is to show the different behavior of setintersection() and keys() on local and var
# as commenting out the region-2 in local and var has different behavior, 
# local behaves correctly, not producing region-2 record at all,
# but var transformer produced an incomplete region-2 record without ip_cidr_range, which will break the resource code execution
output "contrast"         {
  value                 = {
    transform-local     = local.transform-local
    transform-var       = local.transform-var
  }
}

If I uncomment the region-2 line, setintersection() and keys() behave exactly the same on local and var:

output:

contrast            : {
  transform-local   : [ 
    { 
      name          : "region-1-subnet_local"
      ip_cidr_range : "cidr-for-region-1-for-local-obj" 
      region        : "region-1" 
    } 
    { 
      name          : "region-2-subnet_local"
      ip_cidr_range : "cidr-for-region-2-for-local-obj" 
      region        : "region-2" 
    } 
  ]

  transform-var     : [ 
    { 
      name          : "region-1-subnet_var" 
      ip_cidr_range : "cidr-for-region-1-for-var-obj" 
      region        : "region-1" 
    } 
    { 
      name          : "region-2-subnet_var" 
      ip_cidr_range : "cidr-for-region-2-for-var-obj" 
      region        : "region-2" 
    } 
  ]
}

But if I comment out the region-2 line (as shown in the code at the top), the transform-var output is not what I want, because it produces an incomplete record that will break the resource:

output:

contrast            : {
  transform-local   : [ 
    { 
      name          : "region-1-subnet_local"
      ip_cidr_range : "cidr-for-region-1-for-local-obj" 
      region        : "region-1" 
    } 
  ]

  transform-var     : [ 
    { 
      name          : "region-1-subnet_var" 
      ip_cidr_range : "cidr-for-region-1-for-var-obj" 
      region        : "region-1" 
    } 
    { 
      name          : "region-2-subnet_var" 
      region        : "region-2" 
    } 
  ]
}

A HashiCorp developer tried to explain to me that this is expected behavior, but

  1. I don’t understand the explanation, as the default of var.var_obj has the region-2 commented out as well, same as in local.local_obj, the values are the same…
  2. how can I filter out the missing region-2 so no incomplete record will be created?

the explanation given:
the value of var.var_obj has a region-2 attribute and therefore shows region-2 values in the output. The difference between the var and local output is because the values are different. If you have more questions, please reference the community forums where there are many more users who may offer assistance.

Thanks for helping.

Hi @moonlightbeamer,

I think the important difference between the input variable and the local value in your example is that the input variable has an explicit type constraint to guide Terraform’s evaluation of the default value, whereas the local value doesn’t have any explicit type information and so Terraform is trying to automatically infer what type you intended.

The { ... } syntax you used in your local value is the syntax for constructing object values. Terraform automatically creates a new object type whose attributes match the names and value types you wrote in the braces, and uses that as the type of the value. Therefore your local value’s type is something like this, if we were to write it in the type constraint syntax used for variables:

object({
  collection_local = object({
    subnet_local = object({
      other_key = string
      region-1  = string
    })
  })
})

Notice that this isn’t the same as the type constraint you declared for the input variable:

map(
  map(
    object({
      other_key  = string
      region-1   = optional(string)
      region-2   = optional(string)
    })
  )
)

Because Terraform has an explicit type constraint to use for the input variable, it can see that you intended the outer few layers of object to really be maps, and so it converts those objects to maps to match your type constraint. You also have optional(string) for two leaf attributes in that type constraint, so Terraform can see that you intend that object to have three attributes other_key, region-1 and region-2, and so during conversion it will automatically insert the default value for the attribute region-2 which wasn’t in the source object. You didn’t write an explicit default and so the “default default” is null, representing that the attribute is unset.
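As a minimal sketch of that conversion (the variable and output names here are hypothetical, and it assumes a Terraform version with optional attribute support, v1.3 or later), the following shows that keys() reports the attribute that was never written in the default:

```hcl
# Hypothetical single-subnet variable, reduced from var.var_obj for illustration.
variable "example_subnet" {
  type = object({
    other_key = string
    region-1  = optional(string)
    region-2  = optional(string)
  })

  # region-2 is deliberately omitted here.
  default = {
    other_key = "some-value"
    region-1  = "cidr-for-region-1"
  }
}

output "example_subnet_keys" {
  # All three attribute names are reported, because region-2 is part of the
  # object type and was filled in with null during conversion.
  value = keys(var.example_subnet)
}

output "example_region_2_is_null" {
  # Evaluates to true: the attribute exists, but its value is null.
  value = var.example_subnet["region-2"] == null
}
```

This is why setintersection(local.regions, keys(s_v)) still yields region-2 for the variable.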

If you want your local value to be treated the same as your input variable then you will need to write an expression which produces the same type of value as described in your input variable’s type constraint. Terraform doesn’t have any syntax for explicitly constraining the type of a local value, but you can use the type conversion functions like tomap to directly convert the values. So here’s a version of your local value that should match the type constraint:

locals {
  local_obj = tomap({
    collection_local = tomap({
      subnet_local = {
        other_key = "some-value"
        region-1  = "cidr-for-region-1-for-local-obj"
        region-2  = tostring(null)
      }
    })
  })
}

Without a type constraint to guide conversion, we need to use some other techniques to ensure the correct type:

  • Use tomap around any object constructor that is intended to represent a map.
  • Explicitly set any unset optional attributes to a null of the appropriate type, like the tostring(null) here.

These two techniques together should allow Terraform to understand what data type you were intending to construct and so produce a value equivalent to the input variable.

This lack of ability to write out an explicit type constraint for a local value is an unfortunate legacy of much older versions of Terraform that didn’t have support for type constraints. The local value syntax of just assigning a value directly to a name doesn’t give any separate place to write the type constraint, so we need to imply the type by directly converting the values instead. But once you make the type constraints match I expect you’ll see both of these behave in the same way.

Thank you so much @apparentlymart . your this sentence is the key to clear my problem:

that explains why setintersection() sees region-2 is present even though it is NOT in the var.var_obj default values. it sees it present but with value null… that is really troublesome. so the optional() type didn’t do its job properly, per se, as when I say it is optional but didn’t give a value assignment, then it should be treated as no such an attribute, not to be treated as a null valued attribute… agree?

anyways, if that is what optional() truly is designed to do, I have nothing to say. but the solution you were trying to give is not what I want to achieve; the behavior of locals is what I want for var… I do NOT want to see this below record being produced by transform-var at all… so how can I make the var behave like the local in my case?

I have to keep the var outer layers to be map because the var is what I expect user to provide as custom value, I can’t predict what they use for the collection-var name and subnet-var name… but when they do not provide both region-x values, how can I make the insertion of null value to an non-exist key not happening? guess this is the issue of optional()?

in my current workaround, I removed the optional() to mandate user to always provide two region’s cidr range, it makes do, but I really try (if possible) to give user freedom to create non-symmetrical subnets between regions, which means create a subnet in one region but not the other… I now think setintersection() is fine, but optional()'s behavior should change

I guess I could make the var.var_obj most inner layer as map instead of object as well, so no need to use optional() and use lookup to check on the region-1 and region-2 keys’ existence, but I wanted to give user explicit guide on what keys they should provide. bcs in reality, the “other_key” is a lot of keys, (all details for an subnet), I can’t ask user to provide those key and value out of thin air without the guidance an object would give them… any thoughts? tks.

I recall there was a huge amount of discussion when optional was being finalized, and not everyone was happy with the final behaviour - but what’s implemented now won’t be able to change again, due to compatibility.

I think your best course of action for you here, is to add an if clause to your for expression, to filter out region keys which are present with a value of null. i.e.

        for rg         in setintersection(local.regions, keys(s_v)) : {
          name          = "${rg}-${s_k}"
          region        = rg
          ip_cidr_range = s_v[rg]
        } if s_v[rg] != null

Optional attributes were never intended to change the type of a value, since that could render other conversions or transformations invalid. The use of optional is solely so that you can optionally specify attributes, and have a suitable default be inserted otherwise. The reason to use region-2 = optional(string) is to ensure that there is always a region-2 attribute on the object. If you do not want an object with predefined attributes, you need to use a map (or an any type constraint, which of course impacts handling of possibly unknown types)

If you want a map with only a subset of keys, you would have to use other means of validation to prevent unwanted keys.
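One possible form of such validation, as a sketch only: the variable name subnet_regions and the hard-coded list of allowed keys are assumptions for illustration, not something from the original configuration.

```hcl
# Hypothetical variable: a plain map of region name to CIDR range.
variable "subnet_regions" {
  type = map(string)

  validation {
    # setsubtract() leaves only the keys that are NOT in the allowed list,
    # so the condition holds when no unexpected keys remain.
    condition = length(setsubtract(
      keys(var.subnet_regions),
      ["region-1", "region-2"]
    )) == 0
    error_message = "Only the keys region-1 and region-2 are allowed."
  }
}
```

With a map, a key that the caller omits is simply absent, so keys() returns only what was actually provided.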

this if worked very well. thanks @maxb ! I tried if before but not to test on the nullable hidden s_v[rg] as I didn’t even know it exists without @apparentlymart pointing it out for me. thanks both! now I can achieve what I want to without sacrificing functionality I want to keep.

Hi @moonlightbeamer,

It seems like you are perhaps coming from a background of languages with dynamic type systems where e.g. a map just contains arbitrary values that could all be of any type, and so it’s possible and reasonable to have one map element of one object type and another map element of another object type.

Broadly speaking that is not the kind of type system Terraform has. It is not strictly true to say that Terraform has a static type system either – it’s more of a hybrid to allow for escape hatches like jsondecode to return arbitrary data types, and to avoid the need to write explicit type constraints everywhere and focus mainly on constraining API boundaries (inputs and outputs from modules and providers).

But one specific design that Terraform does inherit from statically typed languages is that a collection type like a map has a single element type that must be true for all elements of the map. If you say that your map is of an object with three specific attribute names then all elements of the map must be of that type.

Historically Terraform enforced that by requiring callers to always set all of the attributes, but that was a hazard for future evolution of a module because it made the addition of a new attribute always a breaking change.

The optional attributes mechanism is a compromise to allow successful conversion of an object that is lacking one of the attributes into an object that has all of the required attributes by selecting a default value to use when the source value lacks that attribute. If you don’t specify a default then the default is null, which is consistent with how provider arguments behave when you don’t set them: null here represents the absence of a value, allowing the object to still conform to the type even though it’s dynamically unset.
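To illustrate the two cases, here is a small sketch; the variable name and the default CIDR are hypothetical and only serve the example:

```hcl
# Hypothetical variable showing both forms of optional().
variable "subnet_with_defaults" {
  type = object({
    other_key = string

    # No explicit default: if the caller omits region-1, it becomes null.
    region-1 = optional(string)

    # Explicit default (made-up CIDR): if the caller omits region-2,
    # it becomes "10.0.2.0/24" instead of null.
    region-2 = optional(string, "10.0.2.0/24")
  })
}
```

In both cases the resulting object always has all three attributes; only the value used for an omitted attribute differs.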

My suggestion for how to represent your situation, assuming I understood it correctly, would be to represent your regions as a nested map instead of as part of the object type. The attributes of an object are part of its type but the keys of a map are not part of its type, and so a map is the more appropriate data type for when you need the keys to vary between values:

object({
  other_key = string
  regions = map(string)
})
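A sketch of how the original variable and transformation might look with this shape, assuming the rest of the configuration stays as in the question (the local name transform_var and the sample values are placeholders):

```hcl
variable "var_obj" {
  type = map(
    map(
      object({
        other_key = string
        regions   = map(string)
      })
    )
  )

  default = {
    collection_var = {
      subnet_var = {
        other_key = "some-other-value"
        # Only region-1 is present; region-2 is simply not a key of the map.
        regions = {
          region-1 = "cidr-for-region-1-for-var-obj"
        }
      }
    }
  }
}

locals {
  # Iterating the regions map directly yields one record per provided region,
  # so setintersection() and null filtering are no longer needed.
  transform_var = flatten([
    for c_k, c_v in var.var_obj : [
      for s_k, s_v in c_v : [
        for rg, cidr in s_v.regions : {
          name          = "${rg}-${s_k}"
          region        = rg
          ip_cidr_range = cidr
        }
      ]
    ]
  ])
}
```

The trade-off is that the map accepts any key name, so restricting the keys to known regions would need a separate validation rule.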

When making design decisions like this it might help to think of an object type as being like a class or “struct” in other languages: it has a fixed set of attributes each of a fixed type and all values of that type have those attributes. The syntax in Terraform is different, but the principle is the same. For example, in Go I might describe the above type like this:

struct {
  OtherKey string
  Regions map[string]string
}

The Go compiler will require that all values of this type have both fields, but will not require any particular keys in the Regions map. In Go’s case it does allow an expression producing this type to omit either field, but Go doesn’t have a concept of default values so it uses what it calls “the zero value” instead. Not a perfect analogy then, but analogies between languages rarely are. I hope it’s a useful aid to seeing how Terraform thinks about object types, nonetheless.

Yes, I am aware of this now. but really, in my opinion, I would protest against

but since the “if != null” made do, I am happy now. this might have to be the official companion method to optional(). if someone can update the documentation and put in an example like this (use of if), that will be helpful to people running into this in the future. just my two cents.

and please make this optional() behavior explicitly documented. as it was not mentioned at all to my knowledge. so I was left puzzled why region-2 showed up while I commented it out.

Hi @moonlightbeamer,

Are you referring to the documentation under Optional Object Type Attributes, or did you refer to some other source when you were learning about this feature?

Thanks @apparentlymart you just pointed out that I am not a good document reader as I thought I was :smiley:. I took for granted.

I didn’t read this paragraph, my bad. So it IS documented well. Thanks again… Would it be possible for you to share why the decision is to insert a null value and not just ignore it? Is it not possible to get implemented? (Meaning the inner logic is still every attribute is needed, just spare the user to provide it for some instances) Or this was decided to be the better way. Tks!!

As he posted earlier, the point of an object is that it has a specific structure. Before the optional feature became available you’d have to fill in every single attribute of that object. The idea behind optional is not to change how objects work (i.e. having a specified fixed list of attributes of listed types) but to make them easier to use. Instead of having to set everything to null manually, or having additional code to default values, the language can now do that for you.

Tks. That’s fair. Can’t be too greedy on features of “non fixed object”. esp the “if” expression should make it up for missing attributes if needed

No worries @moonlightbeamer! There’s a lot of text there, so I don’t blame you for skimming it.

My question was not meant to call you out but rather to try to figure out which documentation you were referring to, because when we get documentation feedback it often turns out to be poor discoverability of documentation that already exists rather than totally missing documentation, and so we’d take a different approach (improving links between pages, changing the navigation, etc) if that were the cause.

I see that @stuart-c already answered the question about why it behaves that way, so I won’t repeat it again. Thanks for the feedback!