Handling long-lived instances when images are short-lived

In both cloud IaaS and on-premises environments, images are cycled/updated over time. From what I can tell, if the source image changes, Terraform will attempt to destroy and redeploy the server, which is simply not always an option. So take this scenario:

  • create a “myapp” plan, which includes 3 servers and an LB rule. Plan and deploy
  • 6 months later “myapp” needs a new load balancer rule, so the plan is updated and tested
  • the original image used to deploy has been replaced, so now Terraform either fails (if the plan was explicitly pinned to a now-nonexistent image ID) or wants to rip and replace the three servers (it found a matching name, but the ID doesn’t match).

This seems … problematic, to say the least. How do people handle this sort of thing? Is Terraform basically taking the stance that nothing should be long-lived? Is there a way to tell Terraform to ignore certain aspects of infrastructure once deployed (hey, Ansible has this now: just make sure it exists), or perhaps to simply ignore certain attributes (yeah, the source image is gone, but you already deployed, so chill)?

And of course I eventually found the lifecycle {} stanza, which answers all my above questions.

Whelp … thanks for reading anyway :wink:

Ignore lifecycle for the moment. The server’s image may be defined in the template as a literal value. Instead of a string assignment, the image may also be defined as a named variable, which in turn is assigned a value. Either way, terraform plan will report “No changes,” because the resources in the cloud and the instances in the state file match the image ID in the template.

There is tremendous power in simple values.

ami = "ami-0120102bbbb11123f"
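A minimal sketch of the variable approach described above (the variable and resource names here are illustrative, not from the original post):

```hcl
# Hypothetical names. Because the AMI ID is pinned to a fixed value,
# terraform plan keeps reporting "No changes" even after the upstream
# image is rotated or deleted.
variable "myapp_image_id" {
  type    = string
  default = "ami-0120102bbbb11123f"
}

resource "aws_instance" "myapp" {
  ami           = var.myapp_image_id
  instance_type = "t3.micro"
}
```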

Alternatively, Terraform’s template language is flexible enough to obtain the image by looking it up with a filter. Instead of assigning the image a string value, image is assigned the output of a data source. In this case, if the filter returns a new image ID, Terraform will plan a destroy-and-create to comply with the defined template.
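A sketch of that data-source approach, assuming AWS and a name pattern on the image (the owner and filter values are assumptions):

```hcl
# Looks up the most recent image matching the filter. If the image is
# rotated, this returns a new ID and Terraform plans a destroy/create.
data "aws_ami" "myapp" {
  most_recent = true
  owners      = ["self"]

  filter {
    name   = "name"
    values = ["myapp-image-*"]
  }
}

resource "aws_instance" "myapp" {
  ami           = data.aws_ami.myapp.id
  instance_type = "t3.micro"
}
```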

The declarative intent of the template is used to configure the behavior to your liking.

I appreciate you confirming how the statefile determines change.

Unfortunately, not every environment lets you select an image by name, so the AWS example doesn’t always pan out.

The best example I can give you is vSphere. In that case the module cannot clone from a name; it requires the unique ID. So your alternative scenario becomes the only scenario: the name filter I run will always risk returning a new image ID as the images are rotated over time. From what I can gather, using lifecycle to ignore changes to the “clone” block is the only way to ensure that instances aren’t accidentally deleted. Since that block also contains a bunch of run-once provisioning settings, it seems like a natural fit.
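For reference, here is a sketch of what that looks like against the vSphere provider; the names, IDs, and sizing below are placeholders, and a real resource would also need disk and network blocks:

```hcl
resource "vsphere_virtual_machine" "myapp" {
  name             = "myapp-01"
  resource_pool_id = data.vsphere_resource_pool.pool.id
  datastore_id     = data.vsphere_datastore.ds.id

  num_cpus = 2
  memory   = 4096

  clone {
    # Template UUID resolved at deploy time; it will change as
    # templates are rotated.
    template_uuid = data.vsphere_virtual_machine.template.id
  }

  lifecycle {
    # Once deployed, a rotated template UUID (or any other change to
    # the run-once clone customization) no longer forces a
    # destroy/recreate of the VM.
    ignore_changes = [clone]
  }
}
```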

In general the vSphere module is way less flexible than Azure/AWS… so maybe that’s the real problem.

Thank you again for your input.