GCP issue: when renaming instance groups, Terraform fails to remove VMs properly

I have a strange problem with Terraform (1.0.8) and the GCP provider. Consider the instance group template and managed instance group (MIG) configuration below:

module "sample_container_version1" {
  source  = "terraform-google-modules/container-vm/google"
  version = "~> 2.0"

  container = {
    image = "us-central1-docker.pkg.dev/project/services/sample:version1"
  }
}

module "sample_template_version1" {
  source  = "terraform-google-modules/vm/google//modules/instance_template"
  version = "~> 6.0"

  project_id           = var.gcp_project
  name_prefix          = "sample-template-version1"
  network              = "default"
  source_image_family  = "cos-stable"
  source_image_project = "cos-cloud"

  source_image = reverse(split("/", module.sample_container_version1.source_image))[0]
}

module "sample_instance_group" {
  source  = "terraform-google-modules/vm/google//modules/mig"
  version = "~> 6.0"

  project_id = var.gcp_project
  region     = var.gcp_region

  mig_name = "sample-instance-group"

  instance_template = module.sample_template_version1.self_link
  network           = "default" #google_compute_network.default.self_link
}

resource "google_compute_instance_group_manager" "sample_mig" {
  provider = google-beta
  name     = "sample-mig"

  project = var.gcp_project

  base_instance_name = "sample_instance"
  zone               = "us-central1-a"

  version {
    instance_template = "https://www.googleapis.com/compute/v1/projects/project/global/instanceTemplates/sample-parser-instance-template-version1"
  }

  target_size = 1
}

When I run init/plan/apply, it creates the expected instance using the “_version1” template. All is fine.

Now I replace “version1” with “version2” everywhere, save the changes, and run init/plan/apply again.

Expected behavior: the existing MIG is removed along with its template and VM; a new MIG is created with a VM instantiated from the new template.

Actual behavior:

  • The MIG is renamed (expected)
  • The VM in it remains and is still derived from the “version1” template (unexpected)
  • The version1 template is removed and the version2 template is created (expected)
  • The MIG depended on the “version1” template; now it depends on both the “version1” and “version2” templates (unexpected)

If I stop/delete the VM within the MIG first and then apply the changes to the .tf file, the modification is performed as expected (“version1” is replaced with “version2” everywhere, and the VM is replaced).

Is it possible to make these changes with Terraform only?

I think you need to change the block below in google_compute_instance_group_manager.sample_mig to:

version {
  instance_template = module.sample_template_version1.self_link
}

Sorry, that doesn’t eliminate the issue.

When I do as explained above, using the “version2” names, the final problem remains as stated in the original post.

Maybe I haven’t explained clearly.

When you use the string value, it is seen as just a string, and Terraform won’t recognize the dependency between sample_mig and sample_template_version1. If you use instance_template = module.sample_template_version1.self_link instead, Terraform will recognize the dependency. Then, when you rename sample_template_version1 to sample_template_version2, the template will be recreated, which means every resource that depends on sample_template_version1 will be removed…

Would you mind giving it a try?
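To make the suggestion concrete, here is a minimal sketch of the MIG resource with the hard-coded template URL swapped for a reference to the module output. All names and values are copied from the original post; only the instance_template line differs. It only illustrates how the dependency is expressed, and it is not a confirmed fix for the VM being left behind.

```hcl
resource "google_compute_instance_group_manager" "sample_mig" {
  provider = google-beta
  name     = "sample-mig"

  project = var.gcp_project

  base_instance_name = "sample_instance" # value as given in the original post
  zone               = "us-central1-a"

  version {
    # A reference to the module output instead of a literal URL:
    # Terraform can now see that sample_mig depends on the template module.
    instance_template = module.sample_template_version1.self_link
  }

  target_size = 1
}
```

After renaming the module to sample_template_version2, this reference would need to be updated to module.sample_template_version2.self_link as well.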

I did, @ausmartway. In fact, the only way to force TF to re-create the resources is to assign a different name to “sample_mig”. E.g., if I change it to “sample_mig_version1” etc., only then will TF actually re-create the VM and properly assign the new template to the instance group manager.

Otherwise, the behavior remains the same and the issue persists. By design, we need to keep the MIG name the same, but it looks like TF doesn’t work this way.

Ah, understood. Have you considered the terraform taint command?

This will force recreation of the resource on the next terraform apply, without having to rename it.

@ausmartway ‘taint’ is deprecated, AFAIK. Anyway, neither ‘taint’ nor the ‘-replace’ option for “apply” helps. For unknown reasons, the MIG and its VM survive the “apply”.

I have to either rename the MIG as well or shut down the VM from outside TF. In that case, the resources are correctly re-created.

I am just guessing here, but maybe the address parameter you used with -replace is not correct?

I would use terraform show to find the right resource address, then use -replace… it shouldn’t survive…

If it still survives, you can raise an issue here, attaching a full debug log.

@ausmartway Alas, it survives. Yeah, I’ll post a bug report ASAP.