RDS Global Cluster (Aurora Serverless) Falls Apart When ACUs Are Updated via Terraform

I have built a two-region Global Cluster (Postgres 14.17) made up of Aurora Serverless regional clusters, using Terraform (AWS provider v6.37.0).

As part of PoC testing, I am able to update the ACU settings via the console and the AWS CLI without issue. My problem occurs when I try to update the primary cluster's ACU settings via Terraform: the whole Global Cluster falls apart into the two underlying regional clusters. I don't know what's causing this, but it only seems to happen when working with Terraform. Have I missed something in the documentation, done something dumb (probably), or is this a bug?

Postgres: 14.17

Terraform Provider: registry.terraform.io/hashicorp/aws, v6.37.0
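
For reference, the ACU settings flow in through plain number variables, roughly like this (the defaults shown here are illustrative, not my real values):

variable "min_capacity" {
  description = "Minimum Aurora Serverless v2 capacity, in ACUs"
  type        = number
  default     = 0.5
}

variable "max_capacity" {
  description = "Maximum Aurora Serverless v2 capacity, in ACUs"
  type        = number
  default     = 6
}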

Code for the Regional Cluster:

resource "aws_rds_cluster" "poc_global_database" {
  engine                       = var.engine
  engine_version               = var.engine_version
  engine_mode                  = var.engine_mode
  cluster_identifier           = "regional-${data.aws_region.current.id}-cluster"
  db_cluster_parameter_group_name  = var.db_cluster_parameter_group_name
  storage_encrypted            = true
  kms_key_id = data.aws_kms_key.default_kms_key.arn

  database_name     = var.database_name != null ? var.database_name : null
  master_username   = var.master_username != null ? var.master_username : null
  master_password   = var.master_password != null ? var.master_password : null

  global_cluster_identifier = var.global_cluster_identifier  != null ? var.global_cluster_identifier : null

  db_subnet_group_name         = aws_db_subnet_group.poc_global_database.name
  availability_zones           = data.aws_availability_zones.this.names
  snapshot_identifier          = var.snapshot_update ? var.snapshot_identifier : null

  vpc_security_group_ids = [aws_security_group.poc_global_database.id]

  skip_final_snapshot         = true
  deletion_protection         = false
  backup_retention_period     = 1
  apply_immediately           = true
  allow_major_version_upgrade = false

  delete_automated_backups  = false

  serverlessv2_scaling_configuration {
    min_capacity = var.min_capacity
    max_capacity = var.max_capacity
  }
  tags = {
    backup = "true"
  }
}
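
For completeness, the global cluster and the two regional modules are wired together roughly like this. The module names and the global-cluster identifier match the plan output below; the aws_rds_global_cluster resource name and the eu-west-2 provider alias are placeholders, and most module inputs are trimmed:

resource "aws_rds_global_cluster" "poc_global" {
  global_cluster_identifier = "global-cluster"
  engine                    = "aurora-postgresql"
  engine_version            = "14.17"
  storage_encrypted         = true
}

module "eu-west-1_Cluster" {
  source = "./modules/regional_cluster"

  global_cluster_identifier = aws_rds_global_cluster.poc_global.global_cluster_identifier
  min_capacity              = 0.5
  max_capacity              = 6
  # ...other inputs trimmed
}

module "eu-west-2_Cluster" {
  source = "./modules/regional_cluster"
  providers = {
    aws = aws.eu_west_2
  }

  global_cluster_identifier = aws_rds_global_cluster.poc_global.global_cluster_identifier
  min_capacity              = 0.5
  max_capacity              = 6
  # ...other inputs trimmed
}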

After further testing, changing the ACU settings on either cluster (primary or secondary) causes Terraform to pull the cluster apart. Here’s the plan for one of the changes:

Terraform will perform the following actions:

  # module.eu-west-1_Cluster.aws_rds_cluster.poc_global_database will be updated in-place
  ~ resource "aws_rds_cluster" "poc_global_database" {
      - global_cluster_identifier             = "global-cluster" -> null
        id                                    = "regional-eu-west-1-cluster"
        tags                                  = {
            "backup" = "true"
        }
        # (50 unchanged attributes hidden)

      ~ serverlessv2_scaling_configuration {
          ~ min_capacity             = 2 -> 0.5
            # (2 unchanged attributes hidden)
        }
    }

  # module.eu-west-2_Cluster.aws_rds_cluster.poc_global_database will be updated in-place
  ~ resource "aws_rds_cluster" "poc_global_database" {
        id                                    = "regional-eu-west-2-cluster"
      - replication_source_identifier         = "arn:aws:rds:eu-west-1:2xxx3:cluster:regional-eu-west-1-cluster" -> null
        tags                                  = {
            "backup" = "true"
        }
        # (49 unchanged attributes hidden)

        # (1 unchanged block hidden)
    }

Plan: 0 to add, 2 to change, 0 to destroy.

Note that the global_cluster_identifier and replication_source_identifier attributes get set to null.

Thanks.


Update: further testing

I've spent time doing further testing and the issue seems to go deeper than just ACU changes. To see what happens when any change occurs, I added an S3 bucket to the Terraform code. Even though the only change was adding a bucket, the terraform plan output shows that it wants to set the global_cluster_identifier and replication_source_identifier values to null.
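
The bucket itself was trivial, something like this (the resource and bucket names here are illustrative):

resource "aws_s3_bucket" "poc_change_trigger" {
  bucket = "poc-global-database-change-trigger"
}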

...
  # module.eu-west-1_Cluster.aws_rds_cluster.poc_global_database will be updated in-place
  ~ resource "aws_rds_cluster" "poc_global_database" {
      - global_cluster_identifier             = "global-cluster" -> null
        id                                    = "regional-eu-west-1-cluster"
        tags                                  = {
            "backup" = "true"
        }
        # (50 unchanged attributes hidden)

        # (1 unchanged block hidden)
    }

  # module.eu-west-2_Cluster.aws_rds_cluster.poc_global_database will be updated in-place
  ~ resource "aws_rds_cluster" "poc_global_database" {
        id                                    = "regional-eu-west-2-cluster"
      - replication_source_identifier         = "arn:aws:rds:eu-west-1:2xxxx3:cluster:regional-eu-west-1-cluster" -> null
        tags                                  = {
            "backup" = "true"
        }
        # (49 unchanged attributes hidden)

        # (1 unchanged block hidden)
    }

Plan: 1 to add, 2 to change, 0 to destroy.

Even after I remove the S3 bucket resource from the Terraform code, it still wants to make those changes.

I have also tried using a data source to set the global_cluster_identifier value, but that does not work either.

Fixed…

I have now managed to fix the issue by adding a lifecycle block to the aws_rds_cluster definition:

resource "aws_rds_cluster" "poc_global_database" {

...

  lifecycle {
    ignore_changes = [
      global_cluster_identifier,
      replication_source_identifier
    ]
  }
}

When terraform plan is run with no code changes, I get the "Your infrastructure matches the configuration." message.

When the max_capacity parameter is changed, the plan only wants to update that value, without nullifying the global cluster linkage:

  # module.eu-west-1_Cluster.aws_rds_cluster.poc_global_database will be updated in-place
  ~ resource "aws_rds_cluster" "poc_global_database" {
        id                                    = "regional-eu-west-1-cluster"
        tags                                  = {
            "backup" = "true"
        }
        # (50 unchanged attributes hidden)

      ~ serverlessv2_scaling_configuration {
          ~ max_capacity             = 6 -> 4
            # (2 unchanged attributes hidden)
        }
    }

Plan: 0 to add, 1 to change, 0 to destroy.