Handling Maps which merge user configuration with API configuration

I’m writing a Terraform provider using the Terraform Plugin Framework for an API that merges provided map values with its own set of default map values for a particular attribute when a resource is created. These are not fixed, but can differ per-resource, so the provider can’t know ahead of time what the API-provided defaults will be, nor can it know which keys were user-provided and which were API-provided after the resource is created. The attribute is also immutable for each resource instance, so changes to the user-provided value require the resource to be replaced, and the API-provided defaults will never change on subsequent reads.

For example, if I made a POST request that writes the following value to the attribute:

{
  "user-provided-key": "example"
}

A subsequent GET request might return the following value:

{
  "user-provided-key": "example",
  "api-provided-key": "default"
}

Importantly, user-provided values always supersede API-provided defaults.

Naturally, if I try to naively model this in my Terraform provider by defining the attribute as an Optional Computed MapAttribute, then I get inconsistency errors, since the Read() value will contain keys that the planned value did not have.

Right now I’m modeling this by storing only the user-provided keys in the Terraform state in my Create() implementation, and by taking the intersection of the keys present in the current state and the keys returned from the API in my Read() and Update() implementations. (The intersection allows the resource to be imported without immediately requiring replacement, provided the user-provided values already match those returned by the API.)
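Concretely, the Read() logic boils down to something like this (a simplified Go sketch with illustrative names, not my actual implementation):

// Keep only the keys already tracked in state, but take the API's
// current value for each so genuine drift is still detected. A key
// missing from the API response drops out and shows up as drift.
func intersectWithState(stateValues, apiValues map[string]string) map[string]string {
  result := make(map[string]string, len(stateValues))
  for key := range stateValues {
    if apiValue, ok := apiValues[key]; ok {
      result[key] = apiValue
    }
  }
  return result
}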

This is less than ideal though, since the user has no way to retrieve the values of API-provided defaults:

resource "example" "resource" {
  attribute = {
    user_provided_key = "example"
  }
}

# Error: Invalid index
output "api_provided_key" {
  value = example.resource.attribute.api_provided_key
}

Is the only alternative to provide a separate read-only Computed attribute for the merged attribute value? For example:

resource "example" "resource" {
  attribute = {
    user_provided_key = "example"
  }
}

# api_provided_key = "default"
output "api_provided_key" {
  value = example.resource.attribute_merged.api_provided_key
}

# user_provided_key = "example"
output "user_provided_key" {
  value = example.resource.attribute_merged.user_provided_key
}
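In Plugin Framework terms, I imagine the schema would look roughly like this (a simplified sketch; the RequiresReplace modifier reflects the attribute being immutable):

package provider

import (
  "context"

  "github.com/hashicorp/terraform-plugin-framework/resource"
  "github.com/hashicorp/terraform-plugin-framework/resource/schema"
  "github.com/hashicorp/terraform-plugin-framework/resource/schema/mapplanmodifier"
  "github.com/hashicorp/terraform-plugin-framework/resource/schema/planmodifier"
  "github.com/hashicorp/terraform-plugin-framework/types"
)

type exampleResource struct{} // illustrative; the real resource has more to it

func (r *exampleResource) Schema(ctx context.Context, req resource.SchemaRequest, resp *resource.SchemaResponse) {
  resp.Schema = schema.Schema{
    Attributes: map[string]schema.Attribute{
      // Holds only the user-provided keys; any change forces replacement.
      "attribute": schema.MapAttribute{
        ElementType: types.StringType,
        Optional:    true,
        PlanModifiers: []planmodifier.Map{
          mapplanmodifier.RequiresReplace(),
        },
      },
      // Read-only merged view, including the API-provided defaults.
      "attribute_merged": schema.MapAttribute{
        ElementType: types.StringType,
        Computed:    true,
      },
    },
  }
}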

I don’t particularly like this solution, since it’s not very ergonomic or intuitive for users, and my resource will have more than one attribute that follows this pattern, which would mean a lot of duplicated attributes.

hi @zanecodes

In the long run I think keeping the inputs separated from the outputs will make things much easier. While it seems inconvenient at first that the config doesn’t map directly to the API, it will make the Terraform configuration clearer when the single resource is composed into larger configurations.

If you’re referencing a resource, you usually don’t need to care about what is already statically defined in the configuration, because you could reference that data directly elsewhere. You are typically only concerned with the computed data from the referenced resource, and having that consistently presented as documented outputs makes it easier to understand.

If you attempt to keep the inputs and outputs combined, hide the changes during apply, and update the value on the next read, you are going to end up with confusing situations where users can’t reach a stable configuration after a single plan and apply.

The main thing that bothers me is that Maps are kind of an exception to how resource attributes behave in general with regard to merging user-provided and API-provided values. An attribute that is not configured by the user is left null, and Terraform doesn’t complain if the attribute value changes from null to some other API-provided value in the Terraform state, as long as it’s marked Computed.

For example, the API can add the api_provided_key to the resource with some default value, and this is fine:

resource "example" "resource" {
  user_provided_key = "example"
}

# api_provided_key = "default"
output "api_provided_key" {
  value = example.resource.api_provided_key
}

# user_provided_key = "example"
output "user_provided_key" {
  value = example.resource.user_provided_key
}

But this throws a “Provider produced inconsistent result after apply” error if the API adds api_provided_key to map_attribute and the merged value gets stored in the Terraform state:

resource "example" "resource" {
  map_attribute = {
    user_provided_key = "example"
  }
}

output "api_provided_key" {
  value = example.resource.map_attribute.api_provided_key
}

output "user_provided_key" {
  value = example.resource.map_attribute.user_provided_key
}

It seems like this should work too, as long as the user-provided value isn’t changed or overridden by the API.

I suppose the underlying reason this can’t be made to work is that there’s a semantic difference between the following Maps (e.g. length() and keys() will return different values for each, so length(example.resource.map_attribute) could change between plan and apply even though example.resource.map_attribute.user_provided_key could not):

{
  user_provided_key = "example"
}

{
  user_provided_key = "example"
  api_provided_key  = null
}

Whereas a resource’s attributes are statically defined and known ahead of time, so there is no semantic difference between two such resources.

I’m not entirely sure what you mean about hiding changes during apply and updating the value on the next read; in my current implementation, I take the intersection of the Read value and the user-configured value, so the value doesn’t change between applies. If additional default keys are added by the API, the intersection keeps them out of the Read value, so the Terraform plan shows no changes to be made. The way the API works, user-provided values always override API-provided values, so the user-provided values will never change on subsequent Reads unless genuine drift has occurred.

I think you ended up finding the reasoning as you went through that response, but to be clear: the data model considers a map, in its entirety, to be the value which needs to remain consistent. Changing the keys in a map inherently changes the value of that entire map, whereas an object always has a static set of attributes.

Yes, if you want to suppress additions from the remote API, that will work. I was thinking of the situation where you do want to expose those; a lot of legacy providers tried to do just that, resulting in downstream errors or delayed changes when other resources try to plan using the inconsistent data.

That makes sense. Longer term, would it make sense for Terraform’s data model to permit partially-unknown values? For example, a plan could contain something like the following:

# example.resource will be created
+ resource "example" "resource" {
    + attribute = {
        + user_provided_key = "example"
        + (known after apply)
      }
  }

Values derived from a partially-unknown value could also be marked as entirely or partially-unknown, e.g. keys(example.resource.attribute) could be partially-unknown, length(example.resource.attribute) could be unknown, etc.

I’m not sure if this is something that’s already been considered and either planned for a future version or tabled as out of scope or too complex.

Partially unknown values are supported already, but what you are referring to would be a new, more specific type of map refinement. Its usefulness would be limited: you can’t iterate over the map when the total number of keys is unknown, so you would be restricted to specific known indexes, and if the indexes are known then you might as well use an object.

Terraform’s type system is determined by github.com/zclconf/go-cty, so playing with that library directly can give you a lower-level view of what is at all possible.
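For example, this quick standalone experiment (my sketch, not provider code) shows that partially unknown values already exist when the type is an object, since the attribute names are part of the type itself:

package main

import (
  "fmt"

  "github.com/zclconf/go-cty/cty"
)

func main() {
  // A wholly unknown map: the element type is known, but no keys or
  // values are, so there is nothing to index or iterate.
  unknownMap := cty.UnknownVal(cty.Map(cty.String))
  fmt.Println(unknownMap.IsKnown()) // false

  // An object can mix known and unknown attribute values, because its
  // attribute names are statically part of the type.
  obj := cty.ObjectVal(map[string]cty.Value{
    "user_provided_key": cty.StringVal("example"),
    "api_provided_key":  cty.UnknownVal(cty.String),
  })
  fmt.Println(obj.GetAttr("user_provided_key").AsString()) // example
  fmt.Println(obj.GetAttr("api_provided_key").IsKnown())   // false
  fmt.Println(obj.IsWhollyKnown())                         // false
}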


Cool, thanks for the insight! In this case, the specific known indexes would be determined by the provider user, rather than the provider author, so I think that alone could still be useful; for instance, it allows users to directly reference map values defined by a resource and express the dependency between two resources at the same time, without the need for a local + depends_on:

resource "example" "resource" {
  other_attribute = "value" # this attribute determines which default keys will be provided by the API, which the provider can't know in advance, but the user can
  attribute = {
    user_provided_key = "example"
  }
}

resource "other" "resource" {
  some_attribute = example.resource.attribute.user_provided_key # Terraform knows that this key is present since the user specified it, and its value will be "example" since user-provided values override API-provided defaults
}

resource "third" "resource" {
  another_attribute = example.resource.attribute.api_provided_key # the user knows that this key will be present since they specified other_attribute, even if Terraform has no way of knowing that before apply
}

Yeah, that type of usage isn’t supported, nor could it really be supported without major changes to the underlying type system in Terraform and all the supporting provider libraries (which have a distinct but feature-compatible type system). For starters, the current type system requires that an invalid map index return an error, which can’t be done if you don’t know the full set of keys.

The more fundamental problem is that the type of attribute is a simple map, so there’s no way to describe in a schema how individual keys behave as optional vs. computed, or how to resolve planned changes. The provider can’t declare that attribute["user_provided_key"] is optional but attribute["api_provided_key"] is computed, because they are part of a single value. So it’s not just a matter of whether the cty.Value is possible to represent, but of how the provider can define all the behaviors of that value when inputs come from both ends of the wire protocol.
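For contrast, per-key behaviors are only declarable when the key names are static, e.g. with a nested attribute (an illustrative Plugin Framework fragment that would slot into a schema like any other attribute):

// Per-key Optional/Computed only works because the key names are
// statically part of the schema:
"attribute": schema.SingleNestedAttribute{
  Optional: true,
  Attributes: map[string]schema.Attribute{
    "user_provided_key": schema.StringAttribute{
      Optional: true, // set by the practitioner
    },
    "api_provided_key": schema.StringAttribute{
      Computed: true, // filled in from the API
    },
  },
},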

Of course there’s probably some way to do these things, but that’s in the realm of “if Terraform were designed differently…”. Everything is a compromise, and in years of working with hundreds of providers, the cases where this comes up are quite rare, and in those rare cases they can be solved by techniques like splitting the inputs and outputs.
