How to set state of provider passed to module?

Overview

I imported all of the AWS resources for each AWS region with Terracognita into a separate per-region terraform.tfstate file. I am now trying to merge those state files, along with the Terraform configuration that was generated with them. Unfortunately, I have to do this because Terracognita doesn’t work with multiple AWS regions simultaneously.

So, I have to modify the terraform.tfstate file manually after invoking terraform state mv on each resource from each region. The question is what to set the provider, type, and name JSON properties to for each resource, after creating a module for each AWS region that gets instantiated with its own aliased aws provider.

Details

If I have the following Terraform module definition:

# module_instances.tf
module "provider_us-east-1" {
  source    = "./us-east-1"
  providers = {
    aws = aws.us-east-1
  }
}

And the following provider definition:

# provider.tf
provider "aws" {
  alias  = "us-east-1"
  region = "us-east-1"
}

And invoking terraform plan says this:

  # aws_vpc.vpc_0b00b33f will be destroyed
  # (because aws_vpc.vpc_0b00b33f is not in configuration)

  # module.provider_us-east-1.aws_vpc.vpc_0b00b33f will be created
  + resource "aws_vpc" "vpc_0b00b33f" {
    ...
    }

What do I need to change in the terraform.tfstate file to make sure the VPC doesn’t get destroyed and recreated?

Changing the provider field in the VPC resource in the terraform.tfstate JSON to different values has varying effects:

Changing to this value redundantly creates the resources and also emits a couple of errors:

"provider": "provider[\"registry.terraform.io/hashicorp/aws\"]",

Errors:

│ Error: Invalid provider configuration
│
│ Provider "registry.terraform.io/hashicorp/aws" requires explicit configuration. Add a provider block to the root module and
│ configure the provider's required arguments as described in the provider documentation.
│
╵
╷
│ Error: Invalid AWS Region:
│
│   with provider["registry.terraform.io/hashicorp/aws"],
│   on <empty> line 0:
│   (source code not available)

Changing to this value causes all resources to be destroyed and recreated:

"provider": "module.provider_us-east-1.provider[\"registry.terraform.io/hashicorp/aws\"]",

Setting this value prevents any resources from getting destroyed, but redundantly creates resources that already exist:

"provider": "module.provider_us-east-1",

Setting this value emits the Error: Provider configuration not present error:

"provider": "module.provider_us-east-1.provider[\"registry.terraform.io/hashicorp/aws\"].us-east-1",

Should I instead change one of these fields in the terraform.tfstate JSON?

      "mode": "managed",
      "type": "aws_vpc",
      "name": "vpc_0b00b33f",

Hi @nhooey,

The behavior you’ve described – Terraform proposing to destroy an existing object and create a new one instead of just renaming it to a new address – doesn’t seem like something I would expect to be caused by provider configuration references in the state.

The typical way to avoid that sort of recreation would be to add a moved block that records the fact that you refactored that resource into a child module:

moved {
  from = aws_vpc.vpc_0b00b33f
  to   = module.provider_us-east-1.aws_vpc.vpc_0b00b33f
}

With the above statement included in your root module, you give Terraform a little more information to work with while it’s planning. It should then notice that you have an object in your state bound to aws_vpc.vpc_0b00b33f (the from address) and a resource in your configuration at module.provider_us-east-1.aws_vpc.vpc_0b00b33f (the to address), and so just report that the object has moved rather than proposing to recreate it.

Applying that plan would then update the bindings in the state to match the new structure in the configuration. If necessary, Terraform will also automatically update the "provider" properties in the state to match the configuration. In this case, I would expect Terraform to notice that the resource is associated with provider["registry.terraform.io/hashicorp/aws"].us-east-1 and so update the state to match that.

The configuration is authoritative on which provider each resource belongs to, with the record in the state only being used as a fallback for when you remove the resource block altogether while Terraform still needs to figure out how to plan to destroy the associated object. You should therefore not typically need to directly manipulate the state; it should be sufficient to update the provider references in the configuration and then let terraform apply automatically update the state to match.
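
As a rough sketch of what "update the provider references in the configuration" can look like here: the child module in ./us-east-1 only declares that it needs the hashicorp/aws provider, and its resources use whichever configuration the root module passes in through the providers argument. The file names and the cidr_block value below are placeholders assumed for illustration, not details taken from your configuration.

```hcl
# ./us-east-1/versions.tf (hypothetical file name)
terraform {
  required_providers {
    aws = {
      source = "hashicorp/aws"
    }
  }
}

# ./us-east-1/vpc.tf (hypothetical file name)
# No provider argument: the resource uses the module's default "aws"
# configuration, which the root module maps to aws.us-east-1.
resource "aws_vpc" "vpc_0b00b33f" {
  cidr_block = "10.0.0.0/16" # placeholder value, not from the original thread
}
```

With this layout the alias appears only in the root module, in the provider block and in the providers mapping of the module call, so the "provider" property in the state follows from the configuration on the next apply.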

The issue was that I had to set the module key in the terraform.tfstate JSON file, at the same level as the type and name keys of the resource. Without the module key set to module.name_of_module (here, module.provider_us-east-1), Terraform thinks it has to destroy and recreate the resource in question.

It would be great if overall Terraform was less of a black box so it would be easy to see what the problem is, or if at least the error messages weren’t so terrible. Having some raw terraform.tfstate files and their differences after various operations would also be really helpful.

Hi @nhooey,

I’m glad to hear that you found a path forward.

Unfortunately I think the root problem here is that you manually modified the state file into a shape that Terraform would not have created itself and therefore Terraform didn’t know what to make of what it found in that artifact.

The state snapshots are intended as something managed by Terraform itself, and while you certainly can try to modify them directly it has similar implications to e.g. modifying the on-disk representation of a database directly rather than using the data manipulation API: you are then working at a level where there are no guardrails and you need a pretty complete understanding of how the database software lays out data on disk.

The fact that Terraform stores its internal data structure as JSON rather than in an opaque binary format is convenient for inspection and debugging, but I would not recommend manually modifying state snapshots before becoming familiar with exactly how Terraform interprets that format, and which combinations of values are possible for Terraform to have generated itself.