Does Terraform consider a change in a computed value a non-change?

Hi 👋

I have an attribute called source_code_hash that is set to Computed: true. I also have another attribute called filename, which is a file path.

In my provider I have a Read CRUD method that generates a hash of the file and updates the internal state value for source_code_hash (using d.Set()) if the file has changed (as a change in the file content would obviously result in a new hash being produced).

When using the Delve step debugger I can see that the internal state representation shows the source_code_hash attribute with the new hash value, but at the end of running terraform plan the output suggests there are no differences. If I then manually check the terraform.tfstate file, I still see the old value set for source_code_hash (as if calling d.Set() hadn’t succeeded, although step debugging suggests otherwise, and that call doesn’t return an error).

So there are two issues:

  1. Although I’m calling d.Set() in my ‘Read’ CRUD method to update the source_code_hash attribute (and it looks like it changes internally), the end result is that the state file hasn’t changed, and I think that is fundamentally driving the second issue.

  2. Changing the state doesn’t trigger my ‘Update’ CRUD method to be called (i.e. Terraform doesn’t think there’s anything to do).

I feel like this is somehow fundamentally caused by the source_code_hash attribute being a Computed value. But I’m not sure why that would be.

I’m taking a guess here: because source_code_hash is a computed attribute, and the user hasn’t otherwise modified the filename attribute, maybe Terraform doesn’t persist the state back to disk even though I’ve called d.Set() to update it (and calling d.Get() returns the updated value). Since the user hasn’t updated filename, and source_code_hash is ‘computed’, perhaps Terraform sees no point in persisting it? But I honestly would have thought that a change to a computed attribute would be important to keep stored in the state file. Similarly, maybe because only the ‘computed’ attribute has been modified (internally by my provider), Terraform considers nothing to have changed. Again, I’m just guessing here as I’ve no idea.

Any guidance/help appreciated here as I’m a bit stuck now.

Many thanks!

Terraform considers a “change” to be anything it needs to take action to apply. If reading the new computed value does not result in any changes to be made, then the new value is inconsequential and there is nothing to apply. If you need to have some sort of side effect from a computed value, you could assign it to a root module output.

If you simply want to store the updated state, then you can create a -refresh-only plan which will apply the saved state regardless.
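For reference, a refresh-only cycle looks like this from the CLI (available in Terraform v0.15.4 and later, if I recall correctly):

```shell
# Preview what refreshing would change in the state, without planning
# any infrastructure changes:
terraform plan -refresh-only

# Persist the refreshed values (e.g. a newly computed source_code_hash)
# back into the state file:
terraform apply -refresh-only
```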

Hi @jbardin thanks for the reply.

So currently our source_code_hash attribute is marked as both computed and optional. What happens is, as part of the read operation to get the latest data from the API, we get a hash sum back from the API and store it in the source_code_hash attribute. The user is expected to provide a value for this attribute in their configuration so that Terraform can compare the latest hash sum in the state file against the hash sum provided by the user in their configuration.

For example, the user would set something like the following in their configuration:

source_code_hash = filesha512("package.tar.gz")
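In full context that sits inside the resource block something like the following (the resource type and surrounding attributes here are purely illustrative, not our provider’s actual schema):

```hcl
resource "example_compute_service" "demo" {
  name             = "demo"
  filename         = "package.tar.gz"
  source_code_hash = filesha512("package.tar.gz")
}
```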

All is well at this point.

But in the near future we’re going to need to make an API change that means the hash sum returned from the API is no longer based on a single file like it has historically been (e.g. a package.tar.gz) but a hash of specific files we expect to be inside of the .tar.gz file (e.g. a “main.wasm” and a “fastly.toml”).

This means that if the user continues to pass a hash of the .tar.gz file, the hash will never match what comes back from the API, so the Terraform provider will always trigger an update and re-upload a package file that may well not actually have changed, because we’re comparing two different hash sums.

The user still needs to provide the containing .tar.gz file though as there’s other stuff in there that the API needs.

Do you have any recommendations for this scenario?

Basically, the problem I’m having is figuring out how to do something like:

source_code_hash = filesha512("main.wasm" + "fastly.toml")

…which of course is nonsensical code. This means we have to push this API behaviour change back onto the user (e.g. they have to calculate the hash of these two specific files themselves manually).

Now if that’s just the way it is, then fine. But ultimately this is why my original question was posed around source_code_hash being computed only (rather than also ‘optional’): I was trying to figure out a way to make the source_code_hash attribute internal and have our provider work out the hash sum for the user, rather than forcing them to do it.

NOTE: Another reason I was looking to internalise source_code_hash was that we wanted to expose a url attribute: if the user didn’t want to provide the file via the filename attribute, they could instead provide a URL to the .tar.gz file and the provider would download it. The problem with that is the user wouldn’t have the file on their local machine, so they wouldn’t be able to provide a value for source_code_hash without first downloading the file themselves and then calculating the hash. Hence I wanted to internalise source_code_hash so the user didn’t have to think or worry about it.

Now, going back to the original problem: one option I was considering is that we could expose two new attributes:

  • wasm_binary (i.e. the “main.wasm” file content)
  • manifest (i.e. the “fastly.toml” file content)

Terraform would be able to see from the content whether those files have changed (as the user provides them via configuration attributes) and trigger an update. The annoying part is that we’re again forcing users to provide more specific information, which feels a bit tedious. But I don’t see how else we can handle it.

Anyway, thank you for your help so far.

I’m not sure exactly what you’re looking for here. It sounds as if you don’t want to make the user calculate source_code_hash, and instead have the provider do it. But if the provider is calculating it, and you don’t want the user to verify it manually, then why have the user provide it in the first place? The only way for the user to add the value and ensure it is correct is to calculate it themselves; otherwise they are just reflecting back what the provider claims is correct already.

There may be some confusion here around what the provider can actually do with these values. If the user adds the value to the configuration, the provider is not allowed to alter that value. The provider however is allowed to get away with some misbehaviors if it is using the legacy SDK, since exceptions had to be made for backwards compatibility. You will see WARN lines in the core logs about incorrect plans or applies if this is the case. A good reference document on the resource lifecycle can be found here: Terraform Resource Instance Change Lifecycle

I suspect the goal here is to mimic the pattern used by some resource types like aws_lambda_function in the hashicorp/aws provider, where source_code_hash is a workaround for the fact that the resource type takes a filename rather than an actual file and the filename might not change even if the file contents have.

In that case the provider docs encourage the author to write something like this:

  filename         = "${path.module}/function_source.zip"
  source_code_hash = filebase64sha256("${path.module}/function_source.zip")

The provider updates source_code_hash during its refresh step using a base64-encoded SHA256 checksum returned by the remote API. During the planning step, if the actual file on disk has changed, the source_code_hash value in the configuration no longer matches the value retrieved from the remote system, and so the provider knows that the function needs to be updated even though the filename argument matches between the configuration and the prior state.

The crucial parts of making this pattern work are:

  • The source_code_hash argument is settable in the configuration, so that there’s something to compare with the remote system. That means it must either have Required: true or Optional: true, Computed: true.
  • The source_code_hash argument contains a checksum generated using an algorithm that the remote API and the Terraform configuration can both generate. (In the case of aws_lambda_function that’s a SHA256 checksum because that’s what the API returns in the CodeSha256 property of FunctionConfiguration.)
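To illustrate the second point: the reason filebase64sha256 lines up with Lambda’s CodeSha256 is that both are simply the base64 encoding of a raw SHA-256 digest, so the local and remote values can be compared byte for byte. A stdlib-only sketch (the input bytes are illustrative):

```go
package main

import (
	"crypto/sha256"
	"encoding/base64"
	"fmt"
)

// base64SHA256 mirrors what Terraform's filebase64sha256() computes: a
// base64-encoded raw SHA-256 digest, which is also the format of
// Lambda's CodeSha256 property.
func base64SHA256(data []byte) string {
	sum := sha256.Sum256(data)
	return base64.StdEncoding.EncodeToString(sum[:])
}

func main() {
	// For the pattern to work, this value must be byte-for-byte identical
	// to the checksum the remote API reports for the same content.
	fmt.Println(base64SHA256([]byte("example zip bytes")))
}
```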

It sounds like the main challenge here will be the second point: the API is going to return a checksum in a way that isn’t possible (or isn’t convenient) for the module author to generate using the functions in the Terraform language. I don’t have a ready-to-go solution for that problem; unfortunately I think you will need to find some way for the module to specify checksums in a way that matches what the API will return, which may indeed require a more verbose configuration if it isn’t as straightforward as just the checksum of a single file.

You are spot on! This is exactly the pattern we have at the moment.

The problem we’re facing is that our API will soon change to generate a hash sum that is no longer based on a single file such as package.tar.gz, but on the contents of multiple files within that tar.gz.

This presents a challenge for users of the provider who were doing:

source_code_hash = filesha512("${path.module}/package.tar.gz")

Because now they need to (in pseudo-code) do something like:

source_code_hash = filesha512("${path.module}/src/A" + "${path.module}/src/B")

This is why I started to investigate whether we could avoid the user having to set source_code_hash themselves: as long as we have the filename attribute, the provider can extract the relevant files from the package.tar.gz, calculate the hash, and compare it against what the updated API will return.

The silly mistake I made was thinking I could do this all internally, when in fact the state needs something in the configuration to compare against (hence the source_code_hash attribute in the user configuration).

The other reason I was looking into this was that some people (for CI reasons) didn’t like having to specify a filename (i.e. have the file exist on disk) and would prefer to point to a remote storage location like S3 or GCS. So I started looking into adding a url attribute, and again realised that it didn’t make sense for a user to set source_code_hash in their configuration: in the case of a remote endpoint, they don’t have a local file on disk to hash with a function like filesha512.