Module variable(s) best practice

Hello everyone!

I’m facing a design decision about how variables should be declared in Terraform modules, and it will affect the complete setup. As a consultant, I have a project for a customer where I have to make Terraform modules available in a global repository. I already have a working setup with about 8 modules. My implementation, using a storage account as an example, looks like this:

# variables.tf
variable "name" {
  type        = string
  description = "Name of the storage account"
}

variable "resource_group_name" {
  type        = string
  description = "Name of the resource group"
}

variable "subnet_id" {
  type        = string
  description = "Id of the subnet"
}

variable "location" {
  type        = string
  description = "Name of the location"
}

variable "tags" {
  type        = map(string)
  description = "Tags for resources"
}

variable "is_hns_enabled" {
  type        = bool
  description = "Enable hierarchical namespaces (Synapse = true, Machine Learning = false)"
}

variable "account_replication_type" {
  type        = string
  description = "Replication type of storage account"
}

variable "privatelink" {
  type        = map(object({
    id   = string
    name = string
  }))
  description = "Object of privatelink ids"
}

variable "private_dns_zone_group_name" {
  type        = string
  description = "Name of the Global Private DNS Zone Group"
}

variable "deploy_private_endpoints" {
  type        = bool
  description = "Flag if Private Endpoints should be deployed"
}

# main.tf of deployment definition
module "storage" {
  source                      = "./modules/storage"
  name                        = "st${replace(var.name_suffix, "-", "")}"
  resource_group_name         = var.target_ressource_group_name
  account_replication_type    = "RAGRS"
  is_hns_enabled              = true
  subnet_id                   = var.subnet_id
  location                    = var.location
  deploy_private_endpoints    = var.deploy_private_endpoints
  privatelink                 = var.privatelink
  private_dns_zone_group_name = var.private_dns_zone_group_name
  tags                        = local.tags
}

It is planned that my work will be taken over by internal employees. The customer has hired a DevOps engineer with whom I am now supposed to work. Unfortunately, he has a completely different idea of how variables should be used in modules. His implementation looks like this:

variable "storage" {
  description = "Storage Account object that is passed to module"
  type = object({
    name                     = string
    resource_group_name      = string
    subnet_id                = optional(string)
    private_endpoints        = optional(set(string), [])
    location                 = optional(string, "westeurope")
    account_replication_type = optional(string, "LRS")
    is_hns_enabled           = optional(bool, false)
    tags                     = optional(map(string), { "maintainer" = "terraform" })
    container_names          = optional(set(string), [])
  })
  validation {
    condition     = can(regex("^[a-z0-9]+$", var.storage.name))
    error_message = "var.storage.name must meet regex condition /^[a-z0-9]+$/"
  }
  validation {
    condition     = can(regex("^[A-Za-z0-9-]+$", var.storage.resource_group_name))
    error_message = "var.storage.resource_group_name must meet regex condition /^[A-Za-z0-9-]+$/"
  }
  validation {
    condition     = can(regex("^(West Europe|westeurope|Sweden Central|swedencentral)$", var.storage.location))
    error_message = "var.storage.location must meet regex condition /^(West Europe|westeurope|Sweden Central|swedencentral)$/"
  }
  validation {
    condition     = can(regex("^(LRS|GRS|RAGRS|ZRS|GZRS|RAGZRS)$", var.storage.account_replication_type))
    error_message = "var.storage.account_replication_type must meet regex condition /^(LRS|GRS|RAGRS|ZRS|GZRS|RAGZRS)$/"
  }
  validation {
    # Check every element of the set; contains() expects a single value, not a set.
    condition     = alltrue([for pe in var.storage.private_endpoints : contains(["blob", "dfs", "file", "queue", "table", "web"], pe)])
    error_message = "var.storage.private_endpoints may only contain blob, dfs, file, queue, table or web, or be empty"
  }
  validation {
    condition     = can(regex("(^/subscriptions/.*|^$)", var.storage.subnet_id)) || var.storage.subnet_id == null
    error_message = "var.storage.subnet_id must meet regex condition /^\\/subscriptions\\/.*/ or be null"
  }
  validation {
    condition     = can(regex("(true|false)", var.storage.is_hns_enabled))
    error_message = "var.storage.is_hns_enabled must be either true or false."
  }
}

# main.tf of deployment definition
locals {
  storage = {
    name                     = "abcd"
    resource_group_name      = "xypz"
    subnet_id                = "/subscriptions/xxx/resourceGroups/xxx/providers/Microsoft.Network/virtualNetworks/xxx/subnets/xxx-subnet-default"
  }
}

module "storage" {
  source  = "../storage"
  storage = local.storage
}

As you can see, the question is whether each input should be declared explicitly as its own variable in a module, or whether all inputs should be bundled into a single variable of an object type. My position on this is clear, as I would like to adhere to the official documentation: Input Variables - Configuration Language | Terraform | HashiCorp Developer.

I would be very grateful if you could give me an assessment with a brief explanation of why one variant is preferable to the other.

Thank you very much

In the end it really comes down to preference. The validations and defaults could all be added to the individual variable values, and there are no complex nested optional attributes, so functionally they would work the same.
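
As a rough sketch of that point, here is how two of the object's defaults and checks might look on individual variables. The variable names and default values are taken from the two snippets above; using contains() instead of a regex is just one possible choice, not something either implementation does.

```hcl
# variables.tf (sketch: flat variables carrying the same defaults and checks)
variable "location" {
  type        = string
  default     = "westeurope"
  description = "Name of the location"

  validation {
    # contains() does an exact match against the allowed values.
    condition     = contains(["West Europe", "westeurope", "Sweden Central", "swedencentral"], var.location)
    error_message = "The location must be one of: West Europe, westeurope, Sweden Central, swedencentral."
  }
}

variable "account_replication_type" {
  type        = string
  default     = "LRS"
  description = "Replication type of storage account"

  validation {
    condition     = contains(["LRS", "GRS", "RAGRS", "ZRS", "GZRS", "RAGZRS"], var.account_replication_type)
    error_message = "The account_replication_type must be one of: LRS, GRS, RAGRS, ZRS, GZRS, RAGZRS."
  }
}
```

A caller that omits both arguments gets the defaults, and an invalid value is reported against the specific variable.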

One consideration is that if the input data already tends to come in the form of a single object, assigning it to a single input parameter can be more convenient. Similarly, if the value is always handled as one specific object type at the point of assignment, having the full type specified in one place can be useful. Another point is documentation: if individual inputs have extensive descriptions, separate variables allow those docs to be rendered independently, though in many cases the user may be more inclined to look at the source and comments anyway.
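
To make the first case concrete, here is a minimal sketch in which the input data already exists as a map of objects. The local name storage_accounts and all of its values are hypothetical; the module source and the storage variable are the ones from the object-based snippet above, and the optional attribute defaults assume a Terraform version that supports optional() with defaults.

```hcl
# main.tf of a deployment definition (hypothetical example data)
locals {
  storage_accounts = {
    raw = {
      name                = "straw001"
      resource_group_name = "rg-data"
    }
    curated = {
      name                     = "stcurated001"
      resource_group_name      = "rg-data"
      account_replication_type = "GRS"
    }
  }
}

module "storage" {
  source   = "../storage"
  for_each = local.storage_accounts

  # Each value is already shaped like the module's object type, so it
  # can be assigned as a whole; omitted optional attributes fall back
  # to the defaults declared in the module.
  storage = each.value
}
```

With individual variables, the same call would need one assignment per attribute, plus a fallback such as try() or lookup() for every attribute that may be missing.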

(And unrelated: a boolean value can only be true or false, so there is no reason to convert it to a string just to validate that the string contains "true" or "false".)
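
A small sketch of what that means in practice, assuming the flag is declared as its own variable (the description text here is only illustrative):

```hcl
variable "is_hns_enabled" {
  # The type constraint is the validation: Terraform accepts true or
  # false (and converts the strings "true" and "false"), and rejects
  # anything else before a validation block would even run.
  type        = bool
  default     = false
  description = "Enable hierarchical namespaces"
}
```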

Hello @jbardin

Thank you for your assessment. I agree with you that there are cases where a single input parameter can be more convenient. For modules where several resources are used together, it would improve the overview and usability.

On the other hand, I see too many disadvantages to using this approach for modules that wrap an individual resource. Our goal is to make the modules available company-wide, not to develop them for one specific case. Even inexperienced users should be able to find their way around easily. Unfortunately, I also see that users could then call the modules as follows:

module "storage" {
  source  = "../storage"
  storage = {
    name                     = "abcd"
    resource_group_name      = "xypz"
    subnet_id                = "/subscriptions/xxx/resourceGroups/xxx/providers/Microsoft.Network/virtualNetworks/xxx/subnets/xxx-subnet-default"
  }
}

We also use the Visual Studio Code Terraform plugin. When there is an error in the complex type, for example when a mandatory attribute is not set, the plugin displays “No declaration found for "local.storage"”. With explicitly defined variables, on the other hand, the plugin shows the error “Required attribute "resource_group_name" not specified: An attribute named "resource_group_name" is required here”.

Do you know if the plugin can also give more precise error messages for complex types?

The editor plugin is probably not going to produce any errors more accurately than the terraform cli itself (and probably takes the error directly from the CLI), though it may be able to include some extra hints in the process. Both the errors you mentioned seem reasonable depending on their context.