Terraform for a reusable template in hybrid cloud

Hello, please I need your professional advice, and perhaps some links would be even better. I am trying to create reusable code in the form of a template. I have a single folder containing Terraform files for both Azure and vSphere (main.tf, provider.tf, variables.tf, and tf.vars), and I have already declared the necessary variables in them.

So, what I need to achieve is this: whenever I want to create resources, I can decide which platform I want, either vSphere or Azure, and the resource(s) that I need, let's say CPU=4, Disk=30, and so on, and this should not alter each environment. I thought of using Ansible as the configuration management tool with a Jinja2 template; however, I did not enjoy the outcome.

Some people already recommended the Terraform CDK, but I am not certain whether it will solve my problem because I am relatively new to TypeScript. I will really appreciate any form of advice in this regard.

many thanks

Hi @yomofo2s,

The typical approach for a situation like yours is to decide what common abstraction you intend to provide across both underlying platforms and then use that to write a family of modules that each provide a different implementation of that abstraction for one platform.

Then in your configurations which concretely define particular components, you can decide which of the particular modules to use depending on which platform that component is targeting.

You mentioned CPU and disk which makes me suspect that the common abstraction you’re aiming to provide is virtual machines. Based on that I’ll try to give a high-level example of what I mean. I’m intending this just to show you how I’d go through the design process of solving this problem, not to propose an actual solution. My suggestion is that you follow a similar design process based on your greater awareness of the requirements you need to meet and thus produce a result that suits those requirements.

My first step here would be to think about which characteristics of virtual machines are the same and which differ across the different platforms I need to support. I’m not an expert in either vSphere or Azure, but from what I do know I’d identify the following as characteristics the two platforms have in common:

  • Virtual machines have disks of a particular size.
  • Virtual machines have a “machine type” which encapsulates details such as RAM size and CPU. The exact combinations available vary between platforms but it’s possible to identify similar machine types across multiple platforms to abstract over that to some extent.
  • Virtual machines are each allocated a private IP address by the platform’s virtual network fabric.

There are also some characteristics that are harder to abstract over, due to structural differences:

  • vSphere’s networking model is about virtual switches and ports, whereas Azure’s networking model is at the IP level.
  • Both systems have some idea of a “machine image”, but they are structured quite differently and have different requirements such that it wouldn’t really make sense to share a single image across both platforms.

I’m sure there are other examples of common characteristics and mismatching characteristics, but since this is just an example I’m going to focus on the ones above.

My next step then would be to think about a consistent way to specify the common elements across both platforms. Two modules using a similar methodology means it’s easier to combine them with other components that depend only on the common elements. In this case we need some way to define a machine type and a disk size as input. Inside a module call that might look like this:

module "example" {
  source = "./modules/just-an-example"

  machine_type = "large"
  disk_size_gb = 8
  # ...
}

The disk_size_gb is relatively straightforward because the definition of a “gigabyte” is standard across both platforms. The machine_type argument is more interesting because we need to define some common high-level terminology that we can then map to the specific details of each platform. I used “large” above as an example abstraction; let’s imagine that our new abstraction supports the following two machine types, and maps them on to vSphere and Azure as shown:

machine_type   vSphere           Azure
small          1 CPU, 2GB RAM    Standard_A1_v2
large          4 CPU, 8GB RAM    Standard_A4_v2

Each of our modules can also produce a standard output value private_ip which returns the primary private network IP address for the virtual machine it declared.

So now let’s see what a module implementing this common interface for vSphere might look like:

variable "machine_type" {
  type = string

  validation {
    condition     = contains(["small", "large"], var.machine_type)
    error_message = "Machine type must be either 'small' or 'large'."
  }
}

variable "disk_size_gb" {
  type = number
}

variable "vsphere" {
  type = object({
    name                 = string
    resource_pool_id     = string
    datastore_cluster_id = string
    guest_id             = string
    network_id           = string
  })
}

locals {
  machine_types = tomap({
    small = {
      num_cpus = 1
      memory   = 2048
    }
    large = {
      num_cpus = 4
      memory   = 8192
    }
  })
}

resource "vsphere_virtual_machine" "main" {
  name                 = var.vsphere.name
  resource_pool_id     = var.vsphere.resource_pool_id
  datastore_cluster_id = var.vsphere.datastore_cluster_id

  num_cpus = local.machine_types[var.machine_type].num_cpus
  memory   = local.machine_types[var.machine_type].memory
  guest_id = var.vsphere.guest_id

  network_interface {
    network_id = var.vsphere.network_id
  }

  disk {
    label = "disk0"
    size  = var.disk_size_gb
  }
}

output "private_ip" {
  value = vsphere_virtual_machine.main.default_ip_address
}

And here’s an equivalent module for Azure:

variable "machine_type" {
  type = string

  validation {
    condition     = contains(["small", "large"], var.machine_type)
    error_message = "Machine type must be either 'small' or 'large'."
  }
}

variable "disk_size_gb" {
  type = number
}

variable "azure" {
  type = object({
    name                 = string
    resource_group_name  = string
    location             = string
    network_interface_id = string
    source_image = object({
      publisher = string
      offer     = string
      sku       = string
      version   = string
    })
  })
}

locals {
  machine_types = tomap({
    small = "Standard_A1_v2"
    large = "Standard_A4_v2"
  })
}

resource "azurerm_linux_virtual_machine" "main" {
  name                = var.azure.name
  resource_group_name = var.azure.resource_group_name
  location            = var.azure.location
  size                = local.machine_types[var.machine_type]
  admin_username      = "adminuser"
  network_interface_ids = [
    var.azure.network_interface_id,
  ]

  admin_ssh_key {
    username   = "adminuser"
    public_key = file("~/.ssh/id_rsa.pub")
  }

  os_disk {
    caching              = "ReadWrite"
    storage_account_type = "Standard_LRS"
    disk_size_gb         = var.disk_size_gb
  }

  source_image_reference {
    publisher = var.azure.source_image.publisher
    offer     = var.azure.source_image.offer
    sku       = var.azure.source_image.sku
    version   = var.azure.source_image.version
  }
}

output "private_ip" {
  value = azurerm_linux_virtual_machine.main.private_ip_address
}

There’s a lot in these examples, so here are the main things to notice:

  • Both modules have identical definitions for variable "machine_type" and variable "disk_size_gb", because those are the arguments for our common abstraction.
  • Both modules have an output "private_ip" that returns an IP address for the instance that was created, so consumers of the module can treat the two as equivalent as long as all they need is an IP address.
  • Each module has its own local mapping table from the abstracted machine_type to whatever values are needed to implement that machine type in the particular platform. For vSphere that’s separate num_cpus and memory values, while for Azure that’s just a mapping to one of the predefined machine size SKUs.
  • Each module also has a separate variable to contain the parameters that don’t fit into the abstraction, and must thus be defined differently for each platform. The biggest part of designing an abstraction like this is making the tradeoff for what makes sense to be abstracted and what doesn’t; I was focusing only on machine type and disk size as the common elements here, so I put everything else in the platform-specific object. You might make a different tradeoff in your real system.

Now in each of your configurations that need a virtual machine of a particular type you can decide which of the two modules to instantiate depending on which platform(s) that configuration is targeting:

module "vm" {
  source = "./modules/vm/vsphere"

  machine_type = "large"
  disk_size_gb = 16
  vsphere = {
    # ...
  }
}
or, for a configuration targeting Azure:

module "vm" {
  source = "./modules/vm/azure"

  machine_type = "large"
  disk_size_gb = 16
  azure = {
    # ...
  }
}

Most interestingly though, both of these modules produce an output value private_ip with an equivalent meaning, so if you have some other module that uses a VM created elsewhere and only accesses it by IP address then you can design that module so that it’d be compatible with either of these “vm” modules:

module "uses_vm" {
  source = "./modules/example-uses-vm"

  vm = module.vm
}

Inside this hypothetical “example-uses-vm” module you can declare a vm variable like this to be compatible with both modules:

variable "vm" {
  type = object({
    private_ip = string
  })
}

Then elsewhere in that module you can use var.vm.private_ip to refer to that IP address regardless of which of the two platform-specific VM modules created it.
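For instance (a hypothetical sketch; the local_file resource and the inventory format here are just illustrations, not part of the design above), the consuming module could render that address into an Ansible inventory without caring which platform produced it:

```hcl
# Hypothetical: write the abstracted address into an Ansible inventory file.
# var.vm.private_ip resolves the same way whichever platform module produced it.
resource "local_file" "inventory" {
  filename = "${path.module}/inventory.ini"
  content  = "app ansible_host=${var.vm.private_ip}\n"
}
```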

This sort of approach of defining a common abstraction layer over multiple platforms using multiple modules that all produce compatible outputs is the usual way to achieve “multi-cloud” with Terraform. Of course, it’s most productive if your system is layered in a way where higher-level components only depend on the abstractions and not on the details. In our example above, that means that the only way to use an abstracted VM is to connect to its private IP address, because all other details about the VM are hidden inside the abstraction module.

What I’ve written out here is a worked example of the guidance in the Multi-cloud Abstractions section of the Module Composition guide. That guide includes some other examples of opportunities to create this sort of abstraction, including the idea of abstracting over various hosted implementations of Kubernetes, where the result in all cases is a Kubernetes API endpoint, regardless of what is actually serving that endpoint.
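As a rough sketch of that Kubernetes idea (the attribute names here are invented purely for illustration), the consumer-side contract could be nothing more than an object type describing the endpoint:

```hcl
# Hypothetical interface: any platform-specific module whose outputs match
# this shape can supply the cluster, regardless of what is hosting it.
variable "kubernetes_cluster" {
  type = object({
    api_endpoint   = string
    ca_certificate = string
  })
}
```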

I hope that helps you to see how you can approach a multi-cloud design exercise for your own real system!


Hi @apparentlymart
You are right, I aim to create a virtual machine on the two platforms. You are also right about the combinations of resources, which vary between the two platforms (what a pain). This recommendation worked on both Azure and vSphere; however, the only hurdle I have is separating the modules into different folders, because the two modules cannot be in the same .tf file.
The same applies when calling the modules from my main.tf file. Eventually, I created a tfvars file where my secrets are…TL;DR. My master plan is to have code that is reusable for the two providers. Could there be another way that, when I run terraform apply, I get a prompt to select either the Azure or vSphere platform, after which the deployment would begin? Otherwise, both environments would be deployed at the same time.
Meanwhile, your earlier suggestion worked perfectly. Any form of suggestion would also be great!

Hi @yomofo2s,

Typically the way we select between different cloud platforms is by changing the configuration itself to refer to another cloud platform. Using modules in the way I described serves only to make the necessary changes smaller and, if the architecture allows for it, to let you still share some common functionality between platforms even when the foundational elements are not shared.

There isn’t any way to dynamically choose a module based on input variables, or any other per-run setting. Terraform’s model is that the configuration describes what should exist, and root module input variables are typically only for a small number of settings that need to vary from one run to the next.
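One conventional layout that follows from this (a sketch; the directory names are just an assumption, not a requirement) is a separate root configuration per platform, each calling the matching module, so that “choosing a platform” becomes choosing which directory you run Terraform in:

```sh
# envs/azure/main.tf    calls ./modules/vm/azure
# envs/vsphere/main.tf  calls ./modules/vm/vsphere
cd envs/azure      # or: cd envs/vsphere
terraform init
terraform apply
```

Each root configuration then has its own state, so applying one platform never touches the other.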