Terraform Resource Schema standard naming convention

Is there any Terraform project, plugin, or convention that helps translate the schemas of similar resources from different providers?

  • e.g. think of google_compute_instance vs aws_instance, which can be considered the same type of resource (virtual machines in different cloud providers). Is there an automated way of unifying their resource schemas so that they are easier to reason about?

POSSIBLE (MANUAL/NOT TOO ELEGANT) SOLUTION

The obvious way of doing it is to come up with your own unified schema and then translate both resource schemas into it. I want to know if there is any ongoing work I can contribute to before I start reinventing the wheel here.

  • The resource schema could look like this with Static Data Mapping:

      {
          "resource_type": {
              "udm_virtual_machine": {
                  // This number goes up after any changes in any of the values below
                  "version": 1,
                  "vendor_provider": {
                      "aws": {"resource_type": "aws_instance", "tf_schema_version": 2},
                      "google": {"resource_type": "google_compute_instance", "tf_schema_version": 6},
                      "azure": {"resource_type": "azurerm_linux_virtual_machine", "tf_schema_version": 0}
                  },
                  "udm_schema": {
                      "vm_type": {
                          "aws": "instance_type",
                          "google": "machine_type",
                          "azure": "size"
                      },
                      ...
                      "location_zone": {
                          "aws": "availability_zone",
                          "google": "zone",
                          "azure": "location"
                      }
                  }
              }
          }
      }
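
A minimal sketch of how such a static mapping might be applied, assuming the attribute values are already available as a plain dictionary. The names `UDM_SCHEMA` and `to_unified` are hypothetical and only illustrate the idea; they are not part of Terraform or of any provider.

```python
# Hypothetical static mapping, restricted to the two attributes shown above.
UDM_SCHEMA = {
    "vm_type": {
        "aws": "instance_type",
        "google": "machine_type",
        "azure": "size",
    },
    "location_zone": {
        "aws": "availability_zone",
        "google": "zone",
        "azure": "location",
    },
}


def to_unified(vendor, attributes, udm_schema=UDM_SCHEMA):
    """Rename provider-specific attributes to their unified names.

    Attributes without an entry in the mapping are dropped, and unified
    names whose provider attribute is absent are simply left out.
    """
    unified = {}
    for udm_name, per_vendor in udm_schema.items():
        native_name = per_vendor.get(vendor)
        if native_name is not None and native_name in attributes:
            unified[udm_name] = attributes[native_name]
    return unified


aws_vm = to_unified("aws", {"instance_type": "t3.micro",
                            "availability_zone": "us-east-1a",
                            "ami": "ami-123"})
google_vm = to_unified("google", {"machine_type": "e2-micro",
                                  "zone": "us-central1-a"})
```

Both results use the same keys (`vm_type`, `location_zone`), so they can be compared or checked with the same logic regardless of the provider.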
    

THE ADVANTAGES OF A UNIFIED SCHEMA

The reason to have a somewhat “standard” schema is that the tfplan would have a more similar structure for a google_compute_instance and an aws_instance. This “translated / similar” structure would make it much easier to run automated tests against the tfplan, such as Policy as Code. One could also use such a unified schema separately from the tfplan.

Even if the translation happens separately from the providers, having a common schema would make it easier, and possibly automatable, to compare similar resources from different providers.
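
As a rough illustration of the Policy as Code use case, the sketch below normalizes the `resource_changes` entries of a plan exported as JSON (for example with `terraform show -json` on a saved plan file) and then runs one rule against all virtual machines. The lookup tables and the functions `unify_plan` and `zones_not_allowed` are hypothetical names introduced for this example, and the mapping covers only the two attributes discussed here.

```python
# Hypothetical lookup tables derived from the unified schema idea.
TYPE_TO_VENDOR = {
    "aws_instance": "aws",
    "google_compute_instance": "google",
    "azurerm_linux_virtual_machine": "azure",
}
UDM_SCHEMA = {
    "vm_type": {"aws": "instance_type", "google": "machine_type", "azure": "size"},
    "location_zone": {"aws": "availability_zone", "google": "zone", "azure": "location"},
}


def unify_plan(plan):
    """Return one unified record per virtual-machine resource in a plan dict."""
    records = []
    for rc in plan.get("resource_changes", []):
        vendor = TYPE_TO_VENDOR.get(rc.get("type"))
        if vendor is None:
            continue  # resource type not covered by the mapping
        # "after" is null for deletions, so fall back to an empty dict.
        after = (rc.get("change") or {}).get("after") or {}
        record = {"address": rc.get("address"), "vendor": vendor}
        for udm_name, per_vendor in UDM_SCHEMA.items():
            # Missing (e.g. not yet known) values become None.
            record[udm_name] = after.get(per_vendor[vendor])
        records.append(record)
    return records


def zones_not_allowed(records, allowed_zones):
    """A single provider-agnostic rule: list VMs outside the allowed zones."""
    return [r["address"] for r in records
            if r["location_zone"] not in allowed_zones]
```

The point of the design is that `zones_not_allowed` is written once against the unified names rather than once per provider.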

SOME BACKGROUND

To appreciate how different the schemas of google_compute_instance and aws_instance are, you can compare them with the outputs of these commands:

terraform providers schema -json | jq '.provider_schemas."registry.terraform.io/hashicorp/aws".resource_schemas.aws_instance'

terraform providers schema -json | jq '.provider_schemas."registry.terraform.io/hashicorp/google".resource_schemas.google_compute_instance'

Even the .block.attributes sections of the two JSON structures are completely different, e.g. the property aws_instance.availability_zone is simply called google_compute_instance.zone.
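
To make the difference concrete, the following sketch compares the top-level attribute names of two resource types, given the parsed output of `terraform providers schema -json`. It only relies on the `provider_schemas`, `resource_schemas`, and `block.attributes` keys used in the jq commands above; the function names are hypothetical, and nested blocks are ignored for brevity.

```python
def attribute_names(schema, provider, resource_type):
    """Return the set of top-level attribute names of one resource type."""
    resource = schema["provider_schemas"][provider]["resource_schemas"][resource_type]
    return set(resource["block"].get("attributes", {}))


def compare_resources(schema, left, right):
    """Compare two (provider, resource_type) pairs by attribute name."""
    a = attribute_names(schema, *left)
    b = attribute_names(schema, *right)
    return {
        "shared": sorted(a & b),
        "only_left": sorted(a - b),
        "only_right": sorted(b - a),
    }
```

With the schema saved to a file (e.g. `terraform providers schema -json > schema.json`), it can be loaded with `json.load` and passed in together with `("registry.terraform.io/hashicorp/aws", "aws_instance")` and `("registry.terraform.io/hashicorp/google", "google_compute_instance")`.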

I see other people have faced this problem before, and at least one person wrote about it in “Terraform Config for Multi-Cloud: Problem” and “Terraform Config for Multi-Cloud: Solution”.

  • The articles conclude that “[at the Schema layer] …unification is possible and also somewhat feasible…but it will likely not occur [because it has to be done at the provider level]”. They recommend achieving parity at the configuration layer instead (by authoring Terraform modules that call the individual cloud providers). This does not make writing Policy as Code rules any easier, and it pushes the problem back onto each developer rather than offering a solution everyone can collaborate on.

I’m not aware of any such effort. My initial thought is that it seems like a hard problem because, inevitably, different cloud providers will offer feature sets that don’t correlate 100%.

Furthermore, it’s something with a very large startup cost (detailed study of multiple cloud provider APIs, design, and implementation) that only really pays off when used to support multiple complex migrations or simultaneous multi-cloud hosting. As such, it’s particularly hard to imagine it being anyone’s open-source “passion project”, and equally hard to see a way for a corporate entity to invest in it and monetise it.

On top of all this, many modern applications are probably able to dodge the issue, by using cloud-provider Kubernetes clusters, and Kubernetes as a common application definition language across the clouds.

I do understand why you’re interested in this… but to me the outlook for it happening seems rather bleak.

Thank you for sharing your thoughts. I can see the steep initial curve you mention.

Does Terraform currently require, or at least offer, a way to categorize resources? As in, is there a way to quickly identify “all resources of the type Virtual Machine across all providers”?

Thank you,

No, there’s nothing like that. It’s up to each provider author to pick names for resources, and they’ll probably choose to match the terms used by each cloud provider, because that’s what users will expect.