Replace strategy

I have something a lot like this:

resource "openstack_compute_instance_v2" "basic" {
  count           = 5
  name            = "basic"
  image_id        = "ad091b52-742f-469e-8f3c-fd81cadf0743"
  flavor_id       = "3"
  key_pair        = "my_key_pair_name"
  security_groups = ["default"]

  metadata = {
    this = "that"
  }

  network {
    name = "my_network"
  }
}

If I change the image, obviously all of the instances get deleted and new ones get created.

I want to force it to delete one and create its replacement, THEN delete the next and create its replacement, and so on, as a rolling update.

Is this possible?

Hi @HelenCousins,

Terraform alone doesn’t include built-in facilities for that sort of gradual migration. Typically we rely on features of the underlying platform to achieve this effect, such as in AWS using autoscaling groups so that the autoscaling service can be responsible for swapping the individual images and Terraform is just the one commanding it to do so.

I don’t know if OpenStack has a comparable mechanism, though. If not, it is possible in principle to build a workflow that can achieve it with Terraform, but it will come in the form of automation you build around Terraform rather than something built in to Terraform itself.

For example, you can use input variables to specify both the old and new images and a number representing the “rollout threshold” for which subset of your compute instances will use the new image:

variable "image_id" {
  type = string
}

variable "new_image_threshold" {
  type    = number
  default = 0
}

variable "new_image_id" {
  type    = string
  default = null
}

resource "openstack_compute_instance_v2" "basic" {
  count = 5

  image_id = count.index < var.new_image_threshold ? var.new_image_id : var.image_id
  # ... (remaining arguments as in the original resource)
}

With these arguments in place you can run Terraform multiple times to gradually roll out your new image:

terraform apply -var="image_id=OLD"
terraform apply -var="image_id=OLD" -var="new_image_id=NEW" -var="new_image_threshold=1"
terraform apply -var="image_id=OLD" -var="new_image_id=NEW" -var="new_image_threshold=2"
terraform apply -var="image_id=OLD" -var="new_image_id=NEW" -var="new_image_threshold=3"
terraform apply -var="image_id=OLD" -var="new_image_id=NEW" -var="new_image_threshold=4"
terraform apply -var="image_id=NEW"

At each step, Terraform should notice that one more of the instances now uses the new image ID NEW and propose to replace it. At the last step, we pivot to using NEW as the “main” image ID, leaving new_image_id and new_image_threshold unset again. That final apply also replaces the one remaining instance still on OLD (index 4), and thus we reach the new stable state of using only that one new image.
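The repeated applies above could be generated by a small wrapper script. The following is only a sketch: the function name rollout_commands is made up for illustration, it assumes the variable names from the configuration above, and it only prints the commands rather than running them, so that each plan can be reviewed and the replaced instance checked before moving on to the next step.

```shell
#!/bin/sh
# Hypothetical helper (not part of Terraform): print the apply commands
# for a one-instance-at-a-time rollout.
# Arguments: old image ID, new image ID, instance count
# (the count must match `count` in the resource block).
rollout_commands() {
  old=$1
  new=$2
  count=$3
  step=1
  # Thresholds 1 .. count-1: each apply moves one more instance to the new image.
  while [ "$step" -lt "$count" ]; do
    printf 'terraform apply -var="image_id=%s" -var="new_image_id=%s" -var="new_image_threshold=%s"\n' \
      "$old" "$new" "$step"
    step=$((step + 1))
  done
  # Final pivot: the new image becomes the main image, which also
  # replaces the last instance still running the old one.
  printf 'terraform apply -var="image_id=%s"\n' "$new"
}

# Example: print the five rollout commands for five instances.
rollout_commands OLD NEW 5
```

Running the printed commands one at a time, rather than piping them straight into a shell, leaves room for Terraform's interactive approval prompt and for a health check of the new instance between steps.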

This is just one example of a rollout strategy. Another common one is to write a separate resource for each image version and temporarily have twice as many instances as normal (five on the old image and five on the new), and then destroy the old ones once the new ones are all initialized. That’s an approach that requires fewer intermediate steps, but obviously can only work if your system design can tolerate there temporarily being “too many” instances during a rollout.

Oh, I like the 10 servers option, but alas, the available resource is 5, so it’s delete/create. That threshold thing I had not thought of.