Any method to measure code duplication in Terraform?

I frequently encounter Terraform code that looks like this:

resource "aws_s3_bucket" "dq-args-staging" {
  bucket        = "dq-args-staging"
  force_destroy = false
  versioning {
    enabled    = false
    mfa_delete = false
  }
  server_side_encryption_configuration {
    rule {
      apply_server_side_encryption_by_default {
        sse_algorithm = "AES256"
      }
      bucket_key_enabled = true
    }
  }
  lifecycle_rule {
    id      = "default_lifecycle_rules"
    enabled = true
    transition {
      days          = 0
      storage_class = "INTELLIGENT_TIERING"
    }
    abort_incomplete_multipart_upload_days = 7
    expiration {
      expired_object_delete_marker = true
    }
    noncurrent_version_expiration {
      days = 14
    }
  }
  # tflint-ignore: aws_resource_missing_tags
  tags = merge(
    module.tags.squad_data_infrastructure_staging,
    {
      Squad = "core-data-platform"
    }
  )
}
# tflint-ignore: terraform_naming_convention
resource "aws_s3_bucket" "dq-development" {
  bucket        = "dq-development"
  force_destroy = false
  versioning {
    enabled    = false
    mfa_delete = false
  }
  server_side_encryption_configuration {
    rule {
      apply_server_side_encryption_by_default {
        sse_algorithm = "AES256"
      }
      bucket_key_enabled = true
    }
  }
  lifecycle_rule {
    id      = "default_lifecycle_rules"
    enabled = true
    transition {
      days          = 0
      storage_class = "INTELLIGENT_TIERING"
    }
    abort_incomplete_multipart_upload_days = 7
    expiration {
      expired_object_delete_marker = true
    }
    noncurrent_version_expiration {
      days = 14
    }
  }
  # tflint-ignore: aws_resource_missing_tags
  tags = merge(
    module.tags.squad_data_infrastructure_staging,
    {
      Squad = "core-data-platform"
    }
  )
}

There’s LOADS of duplication in here. There are plenty of strategies for fixing this (e.g. modules, for_each); however, what I’m interested in is measuring this duplication. I would love there to be a CLI tool that I could point at a directory containing Terraform code and have it show me where duplicated code exists.

A tool my organisation uses frequently for measuring code duplication is SonarQube (they refer to it as Copy Paste Detection), but unfortunately it doesn’t support Terraform. I’ve found a thread on the Sonar Community forum, "Duplicate Code Metric for Terraform Code missing", asking for this to be rectified, and a response there said:

So far we haven’t considered implementing Copy Paste Detection for Terraform. It can cause a lot of false positives that we would like to avoid.

(I also responded with an example of my own.)

My question therefore is … does anyone know of a tool/method for measuring code duplication across a Terraform code base?

Bump. Any thoughts here?

You’ll need to create a variable file and bring the dynamic values into it. In your resource, you’ll have to implement some logic using count or for_each to iterate over those variable values.
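
As a rough sketch of what that could look like for the two buckets in the question: the variable name `bucket_names` and the resource label `dq` are made up for illustration, and the inline `versioning`, `server_side_encryption_configuration` and `lifecycle_rule` blocks are copied unchanged from the original, so this assumes the same AWS provider version that code already runs on.

```hcl
# Hypothetical variable holding the only value that differs between the buckets.
variable "bucket_names" {
  type    = set(string)
  default = ["dq-args-staging", "dq-development"]
}

# One resource block, instantiated once per bucket name.
# The label "dq" is an illustrative choice, not taken from the original code.
resource "aws_s3_bucket" "dq" {
  for_each = var.bucket_names

  bucket        = each.key
  force_destroy = false

  versioning {
    enabled    = false
    mfa_delete = false
  }

  server_side_encryption_configuration {
    rule {
      apply_server_side_encryption_by_default {
        sse_algorithm = "AES256"
      }
      bucket_key_enabled = true
    }
  }

  lifecycle_rule {
    id      = "default_lifecycle_rules"
    enabled = true
    transition {
      days          = 0
      storage_class = "INTELLIGENT_TIERING"
    }
    abort_incomplete_multipart_upload_days = 7
    expiration {
      expired_object_delete_marker = true
    }
    noncurrent_version_expiration {
      days = 14
    }
  }

  # tflint-ignore: aws_resource_missing_tags
  tags = merge(
    module.tags.squad_data_infrastructure_staging,
    {
      Squad = "core-data-platform"
    }
  )
}
```

Note that buckets already managed by Terraform would change address (for example from `aws_s3_bucket.dq-development` to `aws_s3_bucket.dq["dq-development"]`), so a `moved` block or `terraform state mv` would be needed to stop Terraform planning a destroy and recreate.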

Thanks for the reply. How does that help me measure code duplication?

I haven’t come across such a tool in my time using Terraform.

One thing I’ll say is that because Terraform is a DSL, there are people who will argue that it’s often better to define things explicitly. And, even as someone who has written some pretty complicated Terraform code, I would also argue that there are definitely times with Terraform where DRY is not necessarily better.

That said, it’s an interesting idea… my guess is that the combination of it being somewhat niche and somewhat difficult to implement is probably why there isn’t such a tool.