Terraform v0.13 "Failed to instantiate provider" for every project

This is a recurring issue I’m running into when upgrading from v0.12.29 to v0.13 (currently v0.13.4). I want to preface this by saying that running the terraform state replace-provider command DOES fix it; I’m here to try to find out why the automatic upgrade does not work.

The snippets below are from a bigger project, but I want to note that I also see the same error on very simple projects with a single AWS resource.


I ran the terraform 0.13upgrade command and the only change it made was adding the required_providers block to my versions.tf file.

terraform {
  required_version = ">= 0.13"
  required_providers {
    archive = {
      source = "hashicorp/archive"
    }
    aws = {
      source = "hashicorp/aws"
    }
    template = {
      source = "hashicorp/template"
    }
  }
}

Note that I have also tried this without the required_providers section.

My pipeline validates and inits fine, but when it plans, I get this error:

Releasing state lock. This may take a few moments...

Error: Could not load plugin


Plugin reinitialization required. Please run "terraform init".

Plugins are external binaries that Terraform uses to access and manipulate
resources. The configuration provided requires plugins which can't be located,
don't satisfy the version constraints, or are otherwise incompatible.

Terraform automatically discovers provider requirements from your
configuration, including providers used in child modules. To see the
requirements and constraints, run "terraform providers".

3 problems:

- Failed to instantiate provider "registry.terraform.io/-/archive" to obtain
schema: unknown provider "registry.terraform.io/-/archive"
- Failed to instantiate provider "registry.terraform.io/-/aws" to obtain
schema: unknown provider "registry.terraform.io/-/aws"
- Failed to instantiate provider "registry.terraform.io/-/template" to obtain
schema: unknown provider "registry.terraform.io/-/template"

##[error]PowerShell exited with code '1'.

I’ve seen people hit this error specifically on Terraform Cloud (which I am not using). The solution there was to remove the bad provider path (e.g. "registry.terraform.io/-/aws") and replace it with the good one (e.g. "registry.terraform.io/hashicorp/aws"). For clarity, here is what the provider.archive entry looks like in state (some info redacted):

"resources": [
    {
      "mode": "data",
      "type": "archive_file",
      "name": "zip_function_name",
      "provider": "provider.archive",
      "instances": [
        {
          "schema_version": 0,
          "attributes": {
            "excludes": null,
            "id": "xxxxx",
            "output_base64sha256": "xxxxx",
            "output_md5": "xxxxx",
            "output_path": "./zip/my-file.zip",
            "output_sha": "xxxxx",
            "output_size": 721,
            "source": [],
            "source_content": null,
            "source_content_filename": null,
            "source_dir": null,
            "source_file": "./script/my-script-file.py",
            "type": "zip"
          }
        }
      ]
    },

For more clarity, here is the output of my terraform providers command:

Providers required by configuration:
.
├── provider[registry.terraform.io/hashicorp/archive]
├── provider[registry.terraform.io/hashicorp/aws]
├── provider[registry.terraform.io/hashicorp/template]
└── module.db
    ├── provider[registry.terraform.io/hashicorp/aws] >= 2.49.*, < 4.0.*
    ├── module.db_option_group
    │  └── provider[registry.terraform.io/hashicorp/aws]
    ├── module.db_parameter_group
    │  └── provider[registry.terraform.io/hashicorp/aws]
    ├── module.db_subnet_group
    │  └── provider[registry.terraform.io/hashicorp/aws]
    └── module.db_instance
        └── provider[registry.terraform.io/hashicorp/aws]

Providers required by state:

    provider[registry.terraform.io/-/archive]

    provider[registry.terraform.io/-/aws]

    provider[registry.terraform.io/-/template]

Like I said at the beginning, the following resolves my errors:

terraform state replace-provider "registry.terraform.io/-/aws" "registry.terraform.io/hashicorp/aws"
terraform state replace-provider "registry.terraform.io/-/template" "registry.terraform.io/hashicorp/template"
terraform state replace-provider "registry.terraform.io/-/archive" "registry.terraform.io/hashicorp/archive"

For reference, this changed the “provider” line to:
"provider": "provider[\"registry.terraform.io/hashicorp/archive\"]",


My question is: why doesn’t the automatic upgrade work for us? From what I’ve read in the upgrade guide, Terraform should be able to resolve the providers it knows about automatically, and I’d expect it to know about aws and the like.

Is there something in our v0.12 setup that caused state issues across the board?

Any help or input in figuring this out would be much appreciated. I really don’t feel like manually updating thousands of state files as issues pop up.

Thank you!

Hi @rymancl,

I know that some folks have encountered problems similar to this, where the automatic upgrade didn’t work, but there haven’t been enough examples yet to see any pattern in the causes, particularly because those with simpler architectures have been happy to use the manual workaround and not investigate further.

Based on what we know so far, I think the following information would be helpful to try to understand better what’s special in your situation that might cause the different behavior:

  • Run terraform init with the environment variable TF_LOG=trace and share that log output, e.g. in a gist. (I expect it will be too long to share inline in a forum comment.)
  • After terraform init succeeds, share the full directory structure under .terraform/plugins, which Terraform should have just populated. (For example, using the tree or find commands.)
  • Run whatever command is returning the error you shared also with TF_LOG=trace and share that output in a similar way, so we can see how subsequent Terraform commands are understanding the on-disk result from terraform init.

It seems like, for some reason, the legacy provider plugins that Terraform wants to use to support the automatic upgrade can’t be found after installation, so the main thing I want to see here is why that might be: for example, whether the installer is failing to put them there in the first place, or whether something is going wrong when later commands scan the available plugins.

Thanks in advance if you are able to share these details! I hope it will lead to an explanation and perhaps also a fix. :crossed_fingers:

Hey @apparentlymart

Here are the TRACE logs you requested. Hopefully everything is acceptable; I had to redact a lot of personal and AWS info from them.

I started with terraform 0.13upgrade and, as expected, it updated my versions.tf file.


First…

terraform init -backend-config="backend.preprod.txt"

Second…

tree .\.terraform\plugins\ /f | Out-File tree.txt

Third…

terraform plan -var-file="preprod.tfvars"

This is where we get the error.

Error: Could not load plugin


Plugin reinitialization required. Please run "terraform init".

Plugins are external binaries that Terraform uses to access and manipulate
resources. The configuration provided requires plugins which can't be located,
don't satisfy the version constraints, or are otherwise incompatible.

Terraform automatically discovers provider requirements from your
configuration, including providers used in child modules. To see the
requirements and constraints, run "terraform providers".

Failed to instantiate provider "registry.terraform.io/-/aws" to obtain schema:
unknown provider "registry.terraform.io/-/aws"

Please let me know if I’ve missed anything or you would like any more details.

Thank you!

Thanks @rymancl! That’s very helpful as a starting point for further debugging.

What I can see in the output you’ve shared here is that for some reason terraform init isn’t trying to install those legacy providers at all. One reason that could happen, I think, is if they were used only in a non-default workspace. In your environment, do you use named workspaces and have providers in your selected workspace that aren’t used in the default workspace? (e.g. because the default workspace is unused and has a totally empty state)

@apparentlymart that’s correct. We use named workspaces. I should have mentioned that the terraform workspace select is done after the init; apologies for leaving that out.

We don’t use the default workspace at all; our automations expect a workspace name for environmental state separation.

Let me know if you need any more information on that!

EDIT: Is the issue here that the workspace selection needs to happen BEFORE the init?

EDIT 2: Confirmed selecting the workspace before initializing doesn’t help. Error:

Backend reinitialization required. Please run "terraform init".
Reason: Initial configuration of the requested backend "s3"

Hi @rymancl,

I think the trick here is that the automatic upgrade flow makes a (shaky) assumption that all workspaces of a particular configuration will tend to use the same providers, and so it tries to use the default workspace as a proxy for which legacy providers might be needed across all workspaces. That isn’t working in your case, though.

I think switching workspaces before init fails because Terraform can’t access the backend to see which workspaces are available, but we might be able to “cheat” by using the environment variable override for the workspace, which takes priority over what terraform workspace select chooses:

set TF_WORKSPACE=preprod
terraform init

(From the output you shared it seems like you’re using Windows, so I’ve used Windows-command-prompt-style environment variable setting above, but the general idea is to set the TF_WORKSPACE environment variable, whichever way you do it.)
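For instance, on a Linux agent the same thing in a POSIX shell would look like this (the workspace name is just the example from above):

```shell
# POSIX-shell equivalent of the Windows "set" command above; a terraform init
# run afterwards in the same shell will see this workspace selection.
export TF_WORKSPACE=preprod
```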

I must admit that I’m proposing this without being certain that terraform init is actually equipped to respect that environment variable, since the environment variable is usually used primarily with terraform plan and terraform apply, but if it does work then that might give you a path to get the automatic upgrade working using a workspace other than default as the “model” for which legacy providers are needed.

@apparentlymart I think you may be onto something with the environment variable.

I started from scratch, set TF_WORKSPACE, and re-ran the init. Here is the portion of the terminal output I think you care about:

Initializing the backend...

Successfully configured the backend "s3"! Terraform will automatically
use this backend unless the backend configuration changes.

Initializing provider plugins...
- Finding latest version of hashicorp/archive...
- Finding latest version of hashicorp/aws...
- Finding latest version of -/aws...
- Installing hashicorp/archive v2.0.0...
- Installed hashicorp/archive v2.0.0 (signed by HashiCorp)
- Installing hashicorp/aws v3.12.0...
- Installed hashicorp/aws v3.12.0 (signed by HashiCorp)
- Installing -/aws v3.12.0...
- Installed -/aws v3.12.0 (signed by HashiCorp)

The following providers do not have any version constraints in configuration,
so the latest version was installed.

To prevent automatic upgrades to new major versions that may contain breaking
changes, we recommend adding version constraints in a required_providers block
in your configuration, with the constraint strings suggested below.

* -/aws: version = "~> 3.12.0"
* hashicorp/archive: version = "~> 2.0.0"
* hashicorp/aws: version = "~> 3.12.0"

Terraform has been successfully initialized!
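For reference, folding those suggested constraints back into the required_providers block from earlier would look roughly like this (only the hashicorp/ providers belong in configuration; the -/aws entry is the legacy state-side alias, and the versions are just the ones init reported above):

```hcl
terraform {
  required_version = ">= 0.13"
  required_providers {
    archive = {
      source  = "hashicorp/archive"
      version = "~> 2.0.0"
    }
    aws = {
      source  = "hashicorp/aws"
      version = "~> 3.12.0"
    }
  }
}
```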

This also produces quite a different tree:

Also, terraform workspace show gives me the correct workspace (preprod) from the env variable, without my having to run terraform workspace select.

Running a plan no longer gives the “Failed to instantiate provider” error.


It’s a bit janky, but it may work… granted, this was all toyed with locally. All of our Terraform goes through a standard, re-usable CI template in Azure DevOps but I’d expect to see the same result there after minor tweaks (like accounting for needing terraform workspace new).


Let me know your thoughts on this!


@apparentlymart just a quick update of progress I’ve made…

I don’t think the TF_WORKSPACE env variable is going to work in our automation because of the case of a new project with a new workspace.
We have some logic like this in a template (pseudo-code):

terraform init w/ backend config

terraform validate

if terraform workspace exists in terraform workspace list
  terraform workspace select
else
  terraform workspace new

terraform plan

This isn’t working with the v0.13 migration since the init is run in the default workspace.


In this current template, if we set TF_WORKSPACE on the task template instead of using terraform workspace select, the first init will fail on a new project:

The currently selected workspace (some-workspace) does not exist.

If we put the init below the logic to determine if the workspace exists, we know that will fail with

Backend reinitialization required. Please run "terraform init".

because terraform workspace list requires backend initialization.


I didn’t want to mess around with setting/checking the env variable from within the script since we need this template to work across Windows and Linux.

So I re-wrote the script in the template to be like this (again, pseudo-code):

terraform init w/ backend config

if terraform workspace exists in terraform workspace list
  terraform workspace select
else
  terraform workspace new

terraform init -reconfigure w/ backend config

terraform validate

terraform plan

This allows us to get by with both

  • Making Terraform happy with the order of commands
  • Getting providers installed in the correct (non-default) workspace
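As a concrete illustration, the select-or-create step from the pseudo-code could be sketched in POSIX shell. This is only a sketch; ensure_workspace is a hypothetical helper name, not part of our actual template.

```shell
#!/bin/sh
# Hypothetical sketch of the select-or-create step from the pseudo-code above.
ensure_workspace() {
  ws="$1"
  # "terraform workspace list" prefixes the selected workspace with "* " and
  # indents the rest; strip those markers before comparing names exactly.
  if terraform workspace list | tr -d '* ' | grep -qx "$ws"; then
    terraform workspace select "$ws"
  else
    terraform workspace new "$ws"
  fi
}
```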

I think I can get by with moving the second init inside the if, after the terraform workspace select, since new projects shouldn’t need re-initialization; they will be v0.13 by default.


All said and done, this seems to allow us a clean migration (no plan complaints) from v0.12 to v0.13 when using non-default workspaces.

Again, I’m still curious to hear your input.

Thanks a ton for your time thus far! :slight_smile:

Hi @rymancl,

This last idea seems like a promising way to set things up so that they’ll just fix themselves as you gradually upgrade. Re-initializing after selecting a workspace is a good approach, since it can then pick up the same workspace name that terraform workspace select or terraform workspace new wrote out, instead of relying on the environment variable.

Doing the extra init only in the select case does seem reasonable, both for the reason you gave and also because a workspace you’ve just created with terraform workspace new won’t have any resources tracked in its state anyway, so there can’t possibly be anything new to install that wasn’t already covered by the first terraform init reading from the configuration.

If you haven’t set it up already you might want to consider enabling a provider plugin cache on your system so that Terraform can make sure it’ll always be able to re-use any already-downloaded providers. If you’re able to preserve the cache between runs then it’ll make even the first terraform init faster, but even if not it should help speed up the second terraform init.
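For example, one way to enable it is the TF_PLUGIN_CACHE_DIR environment variable (there is also an equivalent plugin_cache_dir setting in the CLI configuration file; the path below is just an illustration):

```shell
# Terraform won't create the cache directory itself, so make sure it exists,
# then point TF_PLUGIN_CACHE_DIR at it before running terraform init.
mkdir -p "$HOME/.terraform.d/plugin-cache"
export TF_PLUGIN_CACHE_DIR="$HOME/.terraform.d/plugin-cache"
```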

With that said, once you’ve got everything upgraded to 0.13 this extra step should no longer be necessary (configuration alone is normally sufficient for detecting all of the required providers), so it’s maybe not worth over-thinking if you’ll be able to take this back out again after a little while.