We have GKE’s automatic cluster upgrades enabled as a security policy.
We also define pretty much all of our Kubernetes API objects as Terraform resources.
The issue we’ve run into a few times now (this year) is that a Kubernetes API version gets deprecated, and we have to change the resource address to match the newly upgraded version, then terraform import everything again.
For example:
k8s ingress version v1beta1 gets deprecated, but GKE automatically upgrades all ingresses to v1. No downtime.
Terraform’s ingress resource is pinned to v1beta1, so when I do a terraform plan, it determines that ALL 15 of my ingresses no longer exist and need to be recreated.
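For reference, the resource in the module is shaped roughly like this (names and values are illustrative, not our actual config):

```hcl
# The kubernetes_ingress resource targets the v1beta1 Ingress API.
resource "kubernetes_ingress" "this" {
  metadata {
    name      = var.name
    namespace = var.namespace
  }

  spec {
    rule {
      http {
        path {
          path = "/"
          backend {
            service_name = var.service_name
            service_port = 80
          }
        }
      }
    }
  }
}
```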
So my resolution:
1. Update my resource definition in the module I have 15 instances of to use kubernetes_ingress_v1 instead of kubernetes_ingress
2. Update the contents of that resource definition to match the new syntax
3. terraform import the newly upgraded K8s ingresses in the cluster to match their resources in TF state. 15 times.
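Concretely, the rename and re-import look something like this (module addresses are illustrative, and IIRC kubernetes_ingress_v1 imports by "namespace/name"):

```hcl
# Steps 1-2: same resource, renamed to the v1 type, with the v1 backend shape.
resource "kubernetes_ingress_v1" "this" {
  metadata {
    name      = var.name
    namespace = var.namespace
  }

  spec {
    rule {
      http {
        path {
          path      = "/"
          path_type = "Prefix" # required by the v1 Ingress API

          backend {
            service {
              name = var.service_name
              port {
                number = 80
              }
            }
          }
        }
      }
    }
  }
}

# Step 3, once per module instance:
#   terraform import 'module.app["foo"].kubernetes_ingress_v1.this' 'default/foo'
```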
This is pretty painful. Anyone have a nicer alternative? If you script this out, how do you go about it?
I have not had to do this - but I have thought through these consequences, and decided not to use terraform-provider-kubernetes as a result.
Helm is not without its own pitfalls, but to me it feels like a better fit for programming the Kubernetes API. Though AFAIK it doesn’t offer drift detection, so that may rule it out if drift detection is part of what you needed from Terraform.
In the event I had to deal with an existing ecosystem that had invested in terraform-provider-kubernetes, I’d probably:
Accept that the configuration changes themselves had to be done mostly semi-manually, although some search and replace supporting regular expressions may ease the burden.
But then, for updating the Terraform state, I’d try to automate it.
The terraform-docs tool (terraform-docs.io) has a JSON output mode, which I have in the past found useful for introspecting Terraform configurations: it lets a Python script find out what resource blocks exist without needing to confront HCL parsing.
With that as a building block, I’d be able to read the Terraform configuration before and after the update, and compute pairs of source and destination resource addresses. With those, I’d be in a position to either automate generation of suitable terraform state rm and terraform import commands - or perhaps the new import blocks.
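The generated end product might look something like this (addresses and IDs are hypothetical, and again assuming the v1 ingress imports by "namespace/name"):

```hcl
# generated-imports.tf - hypothetical output of such a script,
# one import block per renamed resource (Terraform 1.5+).
import {
  to = module.app["foo"].kubernetes_ingress_v1.this
  id = "default/foo"
}

import {
  to = module.app["bar"].kubernetes_ingress_v1.this
  id = "default/bar"
}

# ...paired with generated commands to drop the stale addresses, since the
# old address would otherwise plan as a destroy of the live ingress:
#   terraform state rm 'module.app["foo"].kubernetes_ingress.this'
```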
If the schema of the Terraform resource hadn’t changed much, I might be tempted to skip the remove/re-import entirely and instead figure out how to directly manipulate the Terraform state JSON to transition to the new resource type.
Of course, all this would be a short term measure.
Longer term, I really don’t see why terraform-provider-kubernetes couldn’t offer resources that look like this:
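Something like this, as a sketch of a hypothetical schema (not anything the provider offers today; field names are illustrative):

```hcl
resource "kubernetes_ingress" "example" {
  metadata {
    name      = "example"
    namespace = "default"
  }

  # Exactly one version block at a time; each version carries its own schema.
  v1beta1 {
    rule {
      http {
        path {
          path = "/"
          backend {
            service_name = "example"
            service_port = 80
          }
        }
      }
    }
  }
}
```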
Then, upon the version changing, you could just edit the v1beta1 to v1, and make whatever minimal changes are required to the actual configuration, if any.
The resource type would support as many different version blocks as the version of Kubernetes it was targeting did, and it would require that exactly one be specified at a time. Since each block would have its own Terraform schema definition, the provider would be free to support whatever schema changes Kubernetes required of it. And since moving from one version to another would not change the Terraform resource address, the provider could support version changes smoothly, without pain.
If I were in a position of being likely to use terraform-provider-kubernetes, I’d advocate for changing it to operate in this way.