"Could not load the schema for provider" for both AWS and AzureRM after init

I have a terraform project that I’m working on that I wanted to do automated testing against.

Locally, the testing framework works fine but it’s failing on my build/test server (both are ubuntu x64 systems).

I have checked and order of events on the server are:

  1. Check if the terraform folder and lock file exist, they don’t
  2. Run terraform init on ‘default’ workspace
  3. Check if the terraform folder and lock file exist, they do
  4. Swap terraform workspaces to my intended workspace and run init again
  5. Build a custom tfvars file
  6. Run apply on the workspace with the var file

The error I’m getting is:

[31mâ•·[0m[0m
[31m│[0m [0m[1m[31mError: [0m[0m[1mFailed to load plugin schemas[0m
[31m│[0m [0m
[31m│[0m [0m[0mError while loading schemas for plugin components: 2 problems:
[31m│[0m [0m
[31m│[0m [0m- Failed to obtain provider schema: Could not load the schema for provider
[31m│[0m [0mregistry.terraform.io/hashicorp/aws: failed to instantiate provider
[31m│[0m [0m"registry.terraform.io/hashicorp/aws" to obtain schema: Unrecognized remote
[31m│[0m [0mplugin message: 
[31m│[0m [0m
[31m│[0m [0mThis usually means that the plugin is either invalid or simply
[31m│[0m [0mneeds to be recompiled to support the latest protocol..
[31m│[0m [0m- Failed to obtain provider schema: Could not load the schema for provider
[31m│[0m [0mregistry.terraform.io/hashicorp/azurerm: failed to instantiate provider
[31m│[0m [0m"registry.terraform.io/hashicorp/azurerm" to obtain schema: Unrecognized
[31m│[0m [0mremote plugin message: 
[31m│[0m [0m
[31m│[0m [0mThis usually means that the plugin is either invalid or simply
[31m│[0m [0mneeds to be recompiled to support the latest protocol...
[31m╵[0m[0m

Both Error: Failed to load plugin schemas | Azure | Terraform and Terraform Error: Failed to load plugin schemas say to remove the old terraform folder but in this case it doesn’t exist prior to the init

I have tried using -upgrade and -reconfigure in the init as well as check if the terraform files exist prior to the init.

I tried executing the plugin directly after the init and am getting the following error:

This binary is a plugin. These are not meant to be executed directly. 
Please execute the program that consumes these plugins, which 
will load any plugins automatically

Any thoughts on what could be going wrong?

Hi @andre-qumulo! I’m the Martin Atkins that suggested you start this topic in the comments over in your Stack Overflow question.

As I was saying over there, it seems like for some reason the providers are crashing during initial startup, before they complete the handshake protocol with Terraform Core. That means Terraform Core isn’t getting any useful information about what went wrong and can only report this generic error that the plugin didn’t speak the protocol correctly.

Directly running the provider plugin executables and seeing that “This binary is a plugin” error message suggests that these are valid executables for your system, so I think we can rule out potential problems such as Terraform having somehow installed plugins for the wrong platform.

The next thing I’d like to try is a bit more “finicky”: we can try to create an environment that simulates how Terraform Core itself would run these executables and then try to see if they are generating any messages that we humans can understand even though Terraform Core itself can’t understand them.

On my system I’ve made a simple Terraform configuration that only contains a declaration that it depends on the AWS provider, just to help me get a plugin cache directory containing a hashicorp/aws executable so I can show a sequence of commands on my system which you can hopefully try on your system to see if you see a different result.

In the same directory where I ran terraform init I did the following:

  1. export TF_PLUGIN_MAGIC_COOKIE=d602bf8f470bc67ca7faa0386276bbdd4330efaf76d1a219cb4d6991ca9872b2

    This weird command sets up the special environment variable that Terraform provider plugins use to determine whether they are being launched by Terraform or not. Setting this in the shell before running the provider is basically lying to the provider executable and saying that our shell is Terraform Core, so that the provider will try to complete the protocol handshake rather than just immediately exiting with the “This binary is a plugin” error message.

  2. export PLUGIN_PROTOCOL_VERSIONS=5,6

    This will tell the provider plugin that our fake Terraform Core can speak both provider protocol versions 5 and 6. This matches the capabilities of current versions of Terraform and so should hopefully encourage the plugin to start up in the same way it would when running from inside Terraform Core.

  3. This next one is a bit more annoying because the plugin expects a multi-line string in this environment variable:

    export PLUGIN_CLIENT_CERT='-----BEGIN CERTIFICATE-----
    MIICUTCCAfugAwIBAgIBADANBgkqhkiG9w0BAQQFADBXMQswCQYDVQQGEwJDTjEL
    MAkGA1UECBMCUE4xCzAJBgNVBAcTAkNOMQswCQYDVQQKEwJPTjELMAkGA1UECxMC
    VU4xFDASBgNVBAMTC0hlcm9uZyBZYW5nMB4XDTA1MDcxNTIxMTk0N1oXDTA1MDgx
    NDIxMTk0N1owVzELMAkGA1UEBhMCQ04xCzAJBgNVBAgTAlBOMQswCQYDVQQHEwJD
    TjELMAkGA1UEChMCT04xCzAJBgNVBAsTAlVOMRQwEgYDVQQDEwtIZXJvbmcgWWFu
    ZzBcMA0GCSqGSIb3DQEBAQUAA0sAMEgCQQCp5hnG7ogBhtlynpOS21cBewKE/B7j
    V14qeyslnr26xZUsSVko36ZnhiaO/zbMOoRcKK9vEcgMtcLFuQTWDl3RAgMBAAGj
    gbEwga4wHQYDVR0OBBYEFFXI70krXeQDxZgbaCQoR4jUDncEMH8GA1UdIwR4MHaA
    FFXI70krXeQDxZgbaCQoR4jUDncEoVukWTBXMQswCQYDVQQGEwJDTjELMAkGA1UE
    CBMCUE4xCzAJBgNVBAcTAkNOMQswCQYDVQQKEwJPTjELMAkGA1UECxMCVU4xFDAS
    BgNVBAMTC0hlcm9uZyBZYW5nggEAMAwGA1UdEwQFMAMBAf8wDQYJKoZIhvcNAQEE
    BQADQQA/ugzBrjjK9jcWnDVfGHlk3icNRq0oV7Ri32z/+HQX67aRfgZu7KWdI+Ju
    Wm7DCfrPNGVwFWUQOmsPue9rZBgO
    -----END CERTIFICATE-----'
    

    If you paste this whole mess as a single string into the terminal where your shell is running then the shell should notice that the first line ends with an unterminated quoted string and so will wait for more lines of the string before returning to the shell prompt. The last line ends with the closing quote, and therefore ends the command.

    Terraform uses TLS to authenticate the connections between Terraform Core and the plugin so that other software on your system can’t tell the configured plugin to make requests using your credentials. The above is a random TLS certificate I found on the internet just so the syntax is valid enough for the plugin to be able to parse it; since we’re not actually running the plugin inside Terraform we won’t really be connecting to its server and so it doesn’t matter that we don’t actually know the private key associated with this cert.

  4. Finally we’re ready to actually run the plugin executable. This is the same as what you tried before but now that we have the above environment variables set the plugin should try to complete the protocol handshake, rather than exiting immediately:

    ./.terraform/providers/registry.terraform.io/hashicorp/aws/4.47.0/linux_amd64/terraform-provider-aws_v4.47.0_x5

    (You will probably have a different version installed than I do. Run whatever executable you ran before when trying this.)

    On my system the provider was able to start up successfully, and so after a short pause to generate the plugin’s own TLS certificate I see it produce some JSON log lines on stderr and then on stdout it writes the handshake message that Terraform Core would normally be waiting for:

    {"@level":"info","@message":"configuring server automatic mTLS","@timestamp":"2022-12-15T17:03:45.087834-08:00"}

    {"@level":"debug","@message":"plugin address","@timestamp":"2022-12-15T17:03:45.108090-08:00","address":"/tmp/plugin1434008061","network":"unix"}

    1|5|unix|/tmp/plugin1434008061|grpc|MIICTzCCAbCgAwIBAgIQNOSAMUVRBI3X6LcITGqHTzAKBggqhkjOPQQDBDAoMRIwEAYDVQQKEwlIYXNoaUNvcnAxEjAQBgNVBAMTCWxvY2FsaG9zdDAgFw0yMjEyMTYwMTAzMTVaGA8yMDUyMTIxNTEzMDM0NVowKDESMBAGA1UEChMJSGFzaGlDb3JwMRIwEAYDVQQDEwlsb2NhbGhvc3QwgZswEAYHKoZIzj0CAQYFK4EEACMDgYYABAG+aluw3GK7wsP+j0lLokGa+Jwa7AIaNThUZWXCug8ksQVxp5BPD9pm1T8Yt9Ja+5bxy2m9Zo0Q4ZokaWSrJa6GagHKToFZ1zAz6UywWhzn7pSTrmfGlXOMq5wQOS4JR2RB+98Po6l4Vd0h2Hp4lUwhZucjEsaZofwftw3j8jPAIa34B6N3MHUwDgYDVR0PAQH/BAQDAgKsMB0GA1UdJQQWMBQGCCsGAQUFBwMCBggrBgEFBQcDATAPBgNVHRMBAf8EBTADAQH/MB0GA1UdDgQWBBRQYKyQrVsEpH9faumpeX8g0XREdzAUBgNVHREEDTALgglsb2NhbGhvc3QwCgYIKoZIzj0EAwQDgYwAMIGIAkIBf9aLHzEtrdcDoyhVnwrQXYhLQuu9N4wQ01W+GA7h5w5Cx40tIiB91Ps+MR9ql+1l9tncZgHjAOA73UgzkYbJwEECQgHvf144ujKex4stmS8LRkA5hchOTi5fksbIQO3tRd6B1XuttpN+QdIzlV8u+qOpfckdmUMi+mo2ap9jb8fL7BtNkA

The last line I showed above is what Terraform Core was waiting for when it reported “Unrecognized remote plugin message”, and so I’m expecting that you’ll see something different happen at this point, probably involving some error messages written to stderr and nothing at all written to stdout (hence the “unrecognized message” being just an empty string in the error message.)

If you do see the above then you’ll find that the plugin will stay running in your shell and will stubbornly refuse to respond to Ctrl+C to terminate it, because it thinks it’s running as a child of Terraform Core and so is expecting to be shut down in a different way. I used Ctrl+\ (that is, a backslash) in my terminal to send SIGQUIT to the plugin and thus make it quit with an obnoxious set of stack traces, which are not important to this problem.

If you try this and see the plugin print out any other messaging instead of the handshake line then it would help me to see the full output the plugin printed, including any JSON log lines it might have emitted.

If you instead saw it return a similar handshake message like above and had to quit it explicitly with Ctrl+\ then I’d like to know about that too. Unfortunately that won’t give us much new information but at least I can try to think of a different strategy to reproduce how the plugin is crashing.