Data source remote state 0.13 in 0.11 module

Hey all!

Mistakes were made. A shared remote state was upgraded to 0.13, and we discovered a module whose state was stored with 0.11.13. We get “Error: cannot decode dynamic from flatmap”, or, if we use 0.12, the state version incompatibility error: “Error refreshing state: state snapshot was created by Terraform v0.13.5, which is newer than current v0.12.29; upgrade to Terraform v0.13.5 or greater to work with this state”

Is there a method for upgrading the remote state JSON manually? (0.11 -> 0.12 -> 0.13)

If not, is there a sane upgrade path for the remote state in this situation? I ask because, now that it’s happened, I realize there are almost certainly more of these 0.11 remote states.

I’ve tried a few things already.

  • Pull the state, change the version, upload it to the backend, and apply. Terraform tries to re-create all resources anew.
  • Re-init with 0.11 or 0.12 and apply with a codebase pulled from git history. Similar result to the above.

Next ideas:

  • Use 0.12 and insert static values in the code manually to replace remote_state usage (ugh)
  • Roll the remote state back to 0.12, roll the code back to 0.11, apply, and deal with the fallout of new resources that were added since the 0.13 upgrade

Would really appreciate any tips! Obviously, we’re correcting our terraform upgrade process to include an apply… even if there were no state changes. :man_facepalming:

Thanks

Hi @bryanspears,

I must admit I’m not totally sure from your description exactly what situation you’re in, so I can’t say exactly what to suggest, but I do have some information that may be helpful in plotting out some next steps:

  • The Terraform v0.12 format for state snapshots is significantly different from the Terraform v0.11 format, because Terraform v0.12 supports new value types that were simply not possible in v0.11. It seems like you may have encountered such a situation: “dynamic” is an internal name for the capability of a value to have its type determined dynamically at runtime rather than statically in a schema, and such values have no reasonable representation in Terraform v0.11’s “flatmap”-based state snapshot format (which can only really store strings and collections of strings).

    For that reason, I’d not be optimistic about any sort of automatic “downgrade” of the v0.13 state to be compatible with v0.11. There is no built-in mechanism to do it, and doing so would always be lossy; the example below illustrates why.
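
    As a concrete illustration (a made-up example; the output name and values are hypothetical), a v0.12-or-later output like the following has no “flatmap” representation at all, because its element types are mixed:

    # Hypothetical example: this object value mixes strings, numbers, and
    # lists, so there is no way to encode it in v0.11's flatmap-based state
    # format, which is limited to strings and collections of strings.
    output "mixed" {
      value = {
        name  = "example"
        count = 2
        tags  = ["a", "b"]
      }
    }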

  • Later versions of Terraform v0.11 do have some partial support for reading the Terraform v0.12.x state format in terraform_remote_state only, in an attempt to support situations like the one you seem to have encountered, but it can only work as long as the output values in the upgraded state stick to the subset of types that Terraform v0.11 understands (strings and collections of strings); there’s a sketch of this at the end of this item.

    Because the forward-compatibility accommodation in Terraform v0.11 was only designed as a temporary transitional aid for moving from v0.11 to v0.12, it may not be compatible with state snapshots generated by Terraform v0.13 or later. The Terraform team didn’t intentionally break it, but we didn’t actively test for it either because our upgrade processes generally expect you to move only one major version at a time.

    With that said, the Terraform v0.13 and v0.12 state formats are much more similar to each other than the v0.12 format is to v0.11’s, because v0.13’s changes were much more focused (they relate to resource/provider associations). It may therefore be more reasonable either to manually adapt a v0.13.x state snapshot to be similar enough to a v0.12 one to be readable by later v0.11.x releases, or, if your state file is too complicated for manual adjustments, to write a one-off script that reads in the JSON and tweaks it to not use any v0.13-specific constructs.
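
    Here’s that sketch of keeping the upgraded configuration’s root outputs inside the v0.11-compatible subset (the resource addresses are hypothetical):

    # Sketch only, with hypothetical resource names. A plain string and a
    # list of strings are both within the subset of types that Terraform
    # v0.11's terraform_remote_state can decode.
    output "vpc_id" {
      value = aws_vpc.main.id
    }

    output "subnet_ids" {
      # assumes aws_subnet.main uses count, so the splat yields a list
      value = aws_subnet.main[*].id
    }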

  • The terraform_remote_state data source only actually uses the root module outputs of the state snapshots it retrieves, so it may be possible to temporarily create an artificial v0.11-shaped state snapshot that only contains the output values you need in your v0.11-based configurations and use that artificial state snapshot instead of the “real” remote state until you’re able to get everything upgraded to v0.12 and then v0.13.

    For example, if you are using the S3 backend then you could create this sort of “artificial” remote state snapshot using the aws_s3_bucket_object resource, hopefully making it look enough like a real remote state snapshot that v0.11’s terraform_remote_state can be tricked into reading it:

    # The output values we want the v0.11 configurations to be able to read.
    locals {
      fake_outputs = tomap({
        a = "example"
        b = "example"
      })
    }
    
    # Writes an object shaped like a Terraform v0.11 state snapshot
    # (state format version 3) whose root module contains only outputs.
    resource "aws_s3_bucket_object" "example" {
      bucket = "example"
      key    = "fake-0.11-state/terraform.tfstate"
    
      content = jsonencode({
        version           = 3        # the state format version used by v0.11
        terraform_version = "0.11.0"
        serial            = 1
        lineage           = "fake"
        modules = [
          {
            path = ["root"]
            outputs = {
              # v0.11-style outputs record a type alongside each value
              for k, v in local.fake_outputs : k => {
                type  = "string"
                value = v
              }
            }
          },
        ]
      })
    }
    

    You could potentially include something like the above in the configuration that you’ve now inadvertently upgraded to v0.12 or later (I think that’s what you’re dealing with?), so that terraform apply on that configuration will then update both the real state snapshot’s outputs and these “fake outputs” for the v0.11 configurations to read.
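
    On the v0.11 side, the consuming configuration would then point its terraform_remote_state data source at that artificial object instead of the real state. Here’s a sketch reusing the hypothetical bucket and key from above (the region is also made up):

    # Terraform v0.11 syntax: "config" is a block, and outputs are read
    # directly as attributes, e.g. "${data.terraform_remote_state.shared.a}".
    data "terraform_remote_state" "shared" {
      backend = "s3"

      config {
        bucket = "example"
        key    = "fake-0.11-state/terraform.tfstate"
        region = "us-east-1"
      }
    }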

I hope there’s something useful in the above. If any of this seems promising and you’d like to hear more about it, please let me know.

Thank you for your thorough response! Obviously, my OP was not very clear. To put it more succinctly:

I have some 0.12.x Terraform code that depends on a separate 0.13.x remote state, and whose own state was written by 0.11.x.

Yeah, it’s a mess. Your fake state idea might work, but I’ve fiddled with remote state before in similar situations and had issues with the DynamoDB lock hash.

Maybe the simpler explanation above will trigger some thoughts.

@apparentlymart Your fake state suggestion worked! Thank you!

For future reference: I stored some fake 0.11 state output for a shared module on S3 and temporarily loaded that as part of a 0.12 apply to bridge to 0.13. This avoided 0.12 refusing to load the 0.13 state that was in production.
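
In case it’s useful to anyone who finds this later, the bridge in 0.12 syntax looked roughly like this (a sketch; the bucket, key, region, and output name are hypothetical):

    # Terraform v0.12 syntax: "config" is an argument, and output values
    # are read via the "outputs" attribute, e.g.
    # data.terraform_remote_state.shared.outputs.a
    data "terraform_remote_state" "shared" {
      backend = "s3"

      config = {
        bucket = "example"
        key    = "fake-0.11-state/terraform.tfstate"
        region = "us-east-1"
      }
    }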