Ignore Whitespace changes

Hey all
I wanted to know if any of you have an idea how to stop whitespace-only changes from showing up in terraform plan.
Right now we are using jsonencode(jsondecode(file(<file_name>))) to try to avoid them, but it's still not working. Does anyone have an idea how to avoid this?

Hi @JPSerras,

Can you say more about what happened when you tried the jsonencode(jsondecode(...)) solution you mentioned? I would also have expected that to work, so I’d like to know exactly what happened when you tried that in the hope that I can explain why it didn’t work, and thus suggest what to do instead.

Thanks!

Hey @apparentlymart

I can show the tf plan of one of the resources below.
To give a bit of context, we have a folder with several schemas in JSON format, separated into different files. When there are changes, we would like to update only the resource corresponding to the modified schema. To do this we use a for_each loop that reads each schema file and JSON-encodes its contents to update the resource. It works as intended, but initially some files would randomly appear in the tf plan with whitespace problems. While looking for a solution, I saw that some people recommended jsonencode(jsondecode(...)). With that change, all resources now appear to have the whitespace problem. I applied it once, since it was the first apply after the change. But once there was a new change in one schema file, the whitespace problem still appeared for every schema file.

  # aws_glue_schema.this["schemas/file1"] will be updated in-place
  ~ resource "aws_glue_schema" "this" {
        id                    = ""
      ~ schema_definition     = jsonencode( # whitespace changes
            {
                fields    = []
                name   = "test"
                type      = "record"
            }
        )
        tags                  = {
            "description" = "Schema registry"
            "lifecycle"   = "permanent"
        }
        # (10 unchanged attributes hidden)
    }

Hey @apparentlymart

Can you help us understand what the issue is here?

Any updates on this, @apparentlymart?

It seems like the AWS provider’s implementation of aws_glue_schema is missing a normalization rule here, causing it to treat the schemas as different even though the only difference is a meaningless JSON layout change.

The jsonencode(jsondecode(...)) trick can unfortunately only work if the provider and the remote API both agree that minified JSON is the normalized form. I would guess that in this particular case the underlying AWS API is returning the JSON in a non-minified form and the AWS provider is failing to detect and neutralize that change, so Terraform Core thinks that the JSON whitespace decisions must somehow be meaningful in this case (because otherwise the provider should not have returned that difference).

Since Terraform doesn’t have a function for formatting JSON in exactly the way that the AWS Glue API prefers it (whatever that is), I think the only real option here is to fix the missing normalization rule in the provider itself. Therefore I suggest opening a bug report in the AWS provider’s repository.

I think the specific change required in the provider is to add what the Terraform Plugin SDK calls a “diff suppress func” for the schema_definition attribute, which returns true if the old and new values differ only in irrelevant JSON style choices. The provider would also need to configure the SDK to apply that same rule when refreshing from the upstream API, so that the API doesn’t just undo the normalization performed by the provider. In this case there’s an extra quirk: schema_definition can be either JSON or Avro depending on data_format, so the provider might need to perform this extra check only when data_format = "JSON".
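As a rough illustration of the comparison such a diff suppress function would need to make, here is a minimal Go sketch using only the standard library. It is not the AWS provider's actual code: the function name jsonEquivalent is made up for this example, and a real provider would call something like it from the SDK's diff suppress hook for schema_definition, which receives the old and new values as strings.

```go
package main

import (
	"encoding/json"
	"fmt"
	"reflect"
)

// jsonEquivalent is a hypothetical helper (not part of any SDK or provider).
// It reports whether two strings decode to the same JSON value, ignoring
// whitespace and object key order. It returns false if either string is not
// valid JSON, so a caller would fall back to reporting a normal diff.
func jsonEquivalent(oldValue, newValue string) bool {
	var oldDecoded, newDecoded interface{}
	if err := json.Unmarshal([]byte(oldValue), &oldDecoded); err != nil {
		return false
	}
	if err := json.Unmarshal([]byte(newValue), &newDecoded); err != nil {
		return false
	}
	// Decoded objects become maps and arrays become slices, so DeepEqual
	// compares content rather than the original text layout.
	return reflect.DeepEqual(oldDecoded, newDecoded)
}

func main() {
	// Minified form, as jsonencode(jsondecode(...)) would produce it.
	minified := `{"fields":[],"name":"test","type":"record"}`
	// Same content with a different layout and key order (assumed here
	// only for illustration; the real API layout is not known).
	pretty := "{\n  \"type\": \"record\",\n  \"name\": \"test\",\n  \"fields\": []\n}"
	// A genuine content change that must not be suppressed.
	changed := `{"fields":[],"name":"other","type":"record"}`

	fmt.Println(jsonEquivalent(minified, pretty))  // layout-only difference
	fmt.Println(jsonEquivalent(minified, changed)) // real content change
}
```

A function like this would suppress the diff in the first case and let the second one through. Note that decoding into interface{} turns all numbers into float64, so a production version might want a decoder that preserves large integers exactly.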

Working around this in the meantime would require you to hand-write the JSON in exactly the same layout that the AWS Glue API returns, and I don’t know if that layout is even documented, so I don’t think that’s a viable workaround.