Migrating *portion* of state from 0.11 to 0.15

Hi, I have some AWS infrastructure that is still managed via Terraform 0.11. I cannot move all of the infra to 0.15, but I can move some of it out into a separate state managed by 0.15. In case it helps clarify: currently I have VPC + EKS + (RDS + security group for RDS) in Terraform 0.11 state, and CloudFront + S3 in Terraform 0.15 state. I wish to move the (RDS + security group for RDS) to the separate state managed by Terraform 0.15. Once done, the only thing managed by tf 0.11 will be VPC + EKS.

What is the recommended approach? I already have the Terraform 0.15 HCL code extended to create the RDS + RDS SG with the same names etc. I considered the following approaches:

  1. Use terraform015 state mv --state tf011.tfstate --state-out tf015.tfstate RESOURCENAME. This does not work because Terraform 0.15 only understands state back to 0.12.
  2. Use terraform012 state mv --state tf011.tfstate --state-out tf012.tfstate RESOURCENAME, followed by terraform015 state mv --state tf012.tfstate --state-out tf015.tfstate RESOURCENAME. I doubt this will work: I remember moving some state to 0.14 at some point, and that required first going through 0.13, etc. Along the way I also needed to migrate the HCL code itself, although that might have been for 0.11 to 0.12 only.
  3. Make a copy of the HCL, pull the state from S3, verify the plan shows no changes, then migrate the state one version at a time (from 0.N to 0.(N+1) for N = 11 to 14, each step using Terraform version 0.(N+1)), with --state-out pointing to the Terraform 0.15 state file. Finally, discard the pulled state and the temporary HCL.
  4. For each resource to move: get its ARN, then run terraform011 state rm RESOURCENAME followed by terraform015 import RESOURCENAME SAVED_ARN.
  5. Other?

How many resources are you talking about here? Frankly, if it's only a small number of things, it might be easier to script importing all the existing resources than to go through each version.

You can’t do anything with 0.11 except move to 0.12. To get to 0.15/1.0, you’ll have to pass through 0.13 as well, and then you can go to 0.15, as long as there isn’t anything crazy/odd in your HCL.
It should be relatively quick to iterate through the versions, so option 3 makes me think you tried this and it didn’t work?

Have you seen terraformer (GoogleCloudPlatform/terraformer on GitHub)? It is a CLI tool to generate Terraform files from existing infrastructure (reverse Terraform).

I have not tried option 3 yet. I’m leaning towards remove + import, as in your first suggestion, because although there are about 24 resources, they come as 6 groups of 4, so a loop would handle them all.

I have used terraformer, but how does this help? Terraformer generates basic HCL for existing resources. You then have to do a fair bit of massaging of what it “reverse engineered”. I already have the HCL2.

I guess one thing that’s not clear is why you can only move a portion off 0.11. You don’t want anything left on 0.11/0.12 at this point if you can help it.

Yeah a full stack upgrade would be a lot of stuff, and I will have to document the upgrade procedure so it can be repeated in prod and other stacks. So I’d rather do the separation first as it is much more manageable.

Given the intricacy here of moving a specific subset of instances into a separate state, I think honestly I’d do this by pulling the state file locally, copying it to another local file, and then editing both to delete whole resources so that each resource only ends up in one of the two files. Then push one of them back up as-is to be the state for the configuration that will continue on v0.11, and push the other one somewhere else and run through the v0.12 upgrade process on the combination of that new state file and the relevant split configuration.
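As a rough illustration of that split, here is a minimal Python sketch. It assumes the state was pulled to a local file and uses the v0.11-era (version 3) JSON layout, where each entry in `modules` holds a `resources` map keyed by resource address. The function name `split_state`, the `should_move` predicate, and the example data are hypothetical. Bumping `serial` on the kept copy is also an assumption, made so the edited snapshot is treated as newer than the remote one. The sketch does not touch `outputs` or any `depends_on` entries that reference moved resources.

```python
import copy


def split_state(state, should_move):
    """Return (kept, moved) copies of a state dict.

    Every resource ends up in exactly one of the two copies.
    Assumes the version 3 layout: state["modules"][i]["resources"] is a dict
    keyed by resource address.
    """
    kept = copy.deepcopy(state)
    moved = copy.deepcopy(state)
    for kept_mod, moved_mod in zip(kept["modules"], moved["modules"]):
        for address in list(kept_mod["resources"]):
            if should_move(address):
                del kept_mod["resources"][address]
            else:
                del moved_mod["resources"][address]
    # Assumption: bump the serial so the edited snapshot counts as newer.
    kept["serial"] += 1
    return kept, moved


# Tiny made-up state, only to show the shape of the data.
example = {
    "version": 3,
    "serial": 7,
    "lineage": "placeholder-lineage",
    "modules": [
        {
            "path": ["root"],
            "outputs": {},
            "resources": {
                "aws_vpc.main": {"type": "aws_vpc"},
                "aws_db_instance.db": {"type": "aws_db_instance"},
            },
        }
    ],
}

kept, moved = split_state(example, lambda addr: addr.startswith("aws_db_instance."))
print(sorted(kept["modules"][0]["resources"]))
print(sorted(moved["modules"][0]["resources"]))
```

In practice you would load the pulled file with `json.load`, write both results out with `json.dump`, and diff them against the original before pushing anything.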

I say this because ultimately all that terraform state rm and terraform state mv are doing is removing or moving JSON objects around inside the state snapshot. They can work for simple situations involving different addresses in one state, but the -state and -state-out arguments have honestly never worked well (if you are using a remote backend I don’t think they even work at all!) and I would not suggest trying to use them.

Of course there is some risk here of making the resulting state snapshots invalid in some way, or accidentally leaving the same object in both files. These are the risks that the built-in commands typically protect against, but I think once you have the state files loaded you’ll be able to infer quickly how to edit them in a way that Terraform will accept. You could practice on a throwaway state file for a scratch configuration containing only null_resource resources if you like. If you try this and have some questions then I’d be happy to try to answer them.

The only special note I’d make is that it’d be best to change the top-level lineage property in whichever of the two files is going to be considered the “new” state. Terraform won’t raise any flags if you leave them the same, but Terraform uses this as part of a safety check to avoid accidentally applying a plan to the wrong state, and so leaving them both the same will effectively disable that safety check for this particular pair of states.
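A small sketch of that lineage change, assuming the state is already loaded as a Python dict and that a freshly generated UUID string is an acceptable lineage value (the helper name `with_new_lineage` is hypothetical):

```python
import uuid


def with_new_lineage(state):
    """Return a shallow copy of the state with a fresh top-level lineage."""
    new_state = dict(state)
    # Assumption: any new UUID string is fine as a lineage value.
    new_state["lineage"] = str(uuid.uuid4())
    return new_state


original = {"version": 3, "serial": 1, "lineage": "placeholder-lineage", "modules": []}
relabelled = with_new_lineage(original)
print(relabelled["lineage"] != original["lineage"])
```

Only the copy that will become the "new" state needs this; the state that stays with the v0.11 configuration keeps its existing lineage.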

Thanks guys for your input. I dread manual changes because when you have to apply them to many stacks and resources, the chance of a mistake is high. Also, the JSON is slightly different between versions, and I would end up spending quite a bit of time figuring out which differences are significant.

So I used a bit of Python code to find the desired resources and their IDs using tf 0.11, import them into the target state using 0.15, and finally remove the associated tf files (luckily all standalone files) and remove the resources from the source state using terraform state rm.
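This is not the actual script, just a minimal sketch of the idea. It assumes the 0.11 state has been pulled to JSON in the version 3 layout, where each resource has a `type` and a `primary.id`. The function `plan_commands` and the example data are hypothetical. The binary names `terraform011` and `terraform015` are the ones used earlier in this thread. The sketch only looks at the root module and ignores resources that use `count`, whose state keys would need translating to indexed addresses.

```python
def plan_commands(state, wanted_types, tf_old="terraform011", tf_new="terraform015"):
    """Build (imports, removals) command lists for resources of the wanted types."""
    imports, removals = [], []
    for module in state["modules"]:
        if module["path"] != ["root"]:
            continue  # sketch only: nested modules are not handled
        for address, resource in sorted(module["resources"].items()):
            if resource["type"] not in wanted_types:
                continue
            resource_id = resource["primary"]["id"]
            imports.append([tf_new, "import", address, resource_id])
            removals.append([tf_old, "state", "rm", address])
    return imports, removals


# Made-up state fragment for illustration.
state = {
    "version": 3,
    "modules": [
        {
            "path": ["root"],
            "resources": {
                "aws_vpc.main": {"type": "aws_vpc", "primary": {"id": "vpc-123"}},
                "aws_db_instance.db": {
                    "type": "aws_db_instance",
                    "primary": {"id": "mydb-20210101"},
                },
            },
        }
    ],
}

imports, removals = plan_commands(state, {"aws_db_instance"})
for cmd in imports + removals:
    print(" ".join(cmd))
```

Each list could then be passed to `subprocess.run(cmd, check=True)` from the matching working directory: imports first, removals only after every import has succeeded.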

Not trivial but wasn’t too bad either. Some gotchas:

  • The RDS identifier_prefix values, although present in the HCL2 at the time I run the import, are ignored and end up null in the imported state, so I had to make the Python script load the tfstate JSON, find the identifiers, and set them.
  • The security group rules were a bit of a pain to figure out for import: they don’t have an ID of their own, so you need to know quite a bit about the rule itself. It wasn’t too bad in the end, though.
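To illustrate the security group rule gotcha: as far as I understand the AWS provider documentation for aws_security_group_rule, the import ID is built from the security group ID, rule type, protocol, from port, to port, and the source(s), joined with underscores. Treat that format as an assumption and check it against the docs for your provider version. The helper name `sg_rule_import_id` and the example values are hypothetical.

```python
def sg_rule_import_id(security_group_id, rule_type, protocol, from_port, to_port, sources):
    """Join the rule's identifying fields into an underscore-separated import ID.

    Assumed format (verify against your AWS provider version):
    SGID_TYPE_PROTOCOL_FROMPORT_TOPORT_SOURCE[_SOURCE...]
    """
    parts = [security_group_id, rule_type, protocol, str(from_port), str(to_port)]
    parts.extend(sources)
    return "_".join(parts)


# Made-up rule: Postgres ingress from one CIDR block.
rule_id = sg_rule_import_id("sg-0123", "ingress", "tcp", 5432, 5432, ["10.0.0.0/16"])
print(rule_id)
```

The field values themselves (ports, protocol, CIDR blocks or source group) have to come from the old state or from the HCL, which is why the rule needs to be known in some detail before it can be imported.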