What can cause terraform to "forget" that it's already managing a resource?

I’m deploying a fairly simple infrastructure. I’ll run terraform init, terraform plan, terraform apply, and for the first run, everything works fine.

When I add a subsequent resource, apply fails with an error similar to this:

│ Error: A resource with the ID "/subscriptions/578e0f86-0491-4137-9a4e-3a3c0ff28e91/resourceGroups/DEV-Lift_Stihl-Dev_CentralUS/providers/Microsoft.ContainerService/managedClusters/stihldevlift-cluster" already exists - to be managed via Terraform this resource needs to be imported into the State. Please see the resource documentation for "azurerm_kubernetes_cluster" for more information.

Terraform just created this resource in the run before. What’s causing it to forget that and to treat the resource like it already existed and needs to be imported?

Note: I am on Azure, and per security policy we’re required to set skip_provider_registration = true, if that makes a difference.
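For context, the relevant provider block looks roughly like this (a minimal sketch; the features block contents and everything else are omitted):

```hcl
provider "azurerm" {
  features {}

  # Required by our security policy: the service principal is not
  # allowed to register Azure resource providers itself.
  skip_provider_registration = true
}
```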

Hi @blue928,

Did Terraform’s proposed plan include a section “Note: Objects changed outside of Terraform” showing that the previously-created object had been “deleted”?

That is normally how Terraform would report that something it was previously tracking no longer seems to exist, which is the usual reason for there to be a plan to create something that ought to already exist.

Hello,

It’s strange, so let me back up and provide more context. I’m using Azure. The Azure resource group that houses the resources is already created, and I don’t have permissions to edit or alter it, but I do have permissions to add to it. The same goes for the Azure Service Principal / service connection that Terraform uses as its identity.

I declare the resource group as part of my manifest definition (azurerm_resource_group.mygroup), and my workflow is to run terraform init and then terraform import <resource> <resource group ID> so that Terraform can continue to manage it.
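For reference, the import step looks like this (the subscription and group names are placeholders, not the real values):

```shell
# Bind the pre-existing resource group to the address declared in the
# manifest; the second argument is the full Azure resource ID.
terraform import azurerm_resource_group.mygroup \
  /subscriptions/<subscription-id>/resourceGroups/<resource-group-name>
```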

From there I run terraform validate, plan, and apply, and they all work fine. When I add a separate resource (one that has nothing to do with the azurerm_kubernetes_cluster, like a separate external database), plan shows the following:

Note: Objects have changed outside of Terraform

Terraform detected the following changes made outside of Terraform since the
last "terraform apply":

  # azurerm_resource_group.k8s has changed
  ~ resource "azurerm_resource_group" "k8s" {
        id       = "/subscriptions/578e0f86-0491-4137-9a4e-3a3c0ff28e91/resourceGroups/DEV-Lift_Stihl-Dev_CentralUS"
        name     = "DEV-Lift_Stihl-Dev_CentralUS"
      ~ tags     = {
          - "environment" = "stihldevlift" -> null
        }
        # (1 unchanged attribute hidden)

        # (1 unchanged block hidden)
    }

Notice that the tags are being set to null despite my having defined them in my manifest to not be null. It’s right after this that I get the following.

Terraform will perform the following actions:

  # azurerm_kubernetes_cluster.k8s will be created
  + resource "azurerm_kubernetes_cluster" "k8s" {
      + dns_prefix                          = "stihldevliftrgk8s"
      + fqdn                                = (known after apply)
      + id                                  = (known after apply)
      + kube_admin_config                   = (known after apply)
...
...
... (etc)

It just built this cluster resource, so why does it think it has to build it again? I think the tags issue is because I don’t have permissions to manipulate the resource group, and I suspect that’s also contributing to the cluster issue. I can get around the tags error by adding tags = {}. But by that point Terraform already thinks it has to build a new cluster, and during apply it sees the existing resource and tells me I need to import it, per the errors above. So somehow it “forgot” that it just created it. How can I debug what’s happening there?

There are a couple of contributing factors I’m trying to sniff out here. Terraform warns that by registering providers manually we could get hard-to-decipher errors. Maybe I’m missing a provider I need but am not aware of? Or maybe it’s a permissions issue, where not being able to fully manage an imported resource causes Terraform to ‘malfunction’ in a way that confuses its state?

Thanks for any direction!

Hi @blue928,

The first output you shared is reporting that something other than terraform apply for this configuration seems to have deleted the environment tag from that resource group. That seems to suggest that something other than your current Terraform configuration is already managing that object, in which case it would be better for your Terraform configuration to treat it as an external dependency rather than a directly-managed object, which you can achieve using the azurerm_resource_group data source:

data "azurerm_resource_group" "k8s" {
  name = "DEV-Lift_Stihl-Dev_CentralUS"
}

Elsewhere in your configuration you can then refer to data.azurerm_resource_group.k8s instead of azurerm_resource_group.k8s in order to access the id attribute or any other resource group attributes you need.
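For example, a resource that currently reads attributes from the managed resource group would switch to the data source like this (a sketch of the reference style only, not a full cluster definition):

```hcl
resource "azurerm_kubernetes_cluster" "k8s" {
  name                = "stihldevlift-cluster"
  dns_prefix          = "stihldevliftrgk8s"

  # Read the group's attributes through the data source instead of
  # a managed azurerm_resource_group.k8s resource.
  location            = data.azurerm_resource_group.k8s.location
  resource_group_name = data.azurerm_resource_group.k8s.name

  # ... node pools, identity, etc. unchanged ...
}
```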

Since you’ve already imported the object as a managed resource, if you switch to a data resource now you’ll need to tell Terraform to “forget” the existing binding, which is essentially the opposite of the terraform import you did earlier, so that Terraform won’t think you intend to destroy this object:

terraform state rm azurerm_resource_group.k8s

Terraform assumes that any object bound to a managed resource instance is being exclusively managed by that particular Terraform configuration, so when you use terraform import you need to be careful to only import objects that you are responsible for managing. If it’s just an object you are using, managed elsewhere, then a data resource is the correct way to declare that, so that Terraform will not try to change the settings of the object.


With that all said, it isn’t clear to me that this resource group situation is related to the situation with the Kubernetes cluster. If Terraform isn’t showing that the object was deleted in the “Objects have changed outside of Terraform” section then that suggests something very strange is going on; Terraform should always report if it found something missing during the refresh/plan phase, in the same way that it reported the environment tag was missing from the resource group.

I’m not sure what to suggest next, but maybe if you can share the entire plan output (rather than just the snippets you already shared) the rest of it will give me some more information I’m not thinking of yet.

I’m not sure what you mean by the warning about “registering providers manually”, so it would also help if you could include the exact output of that warning so I can see what exactly it’s referring to.

I found this HashiCorp example which explains a workflow for working with existing Azure AKS Kubernetes clusters. If I follow the workflow of first using a targeted apply as they suggest, I never get the error; if I don’t, the error creeps up. It’s an interesting workflow that requires a data resource to be used on the just-created cluster to avoid hard-to-decipher errors.
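The targeted-apply step from that workflow looks roughly like this (the resource address matches the one in this thread, but treat it as illustrative):

```shell
# Create just the cluster first, then run a full apply so the rest of
# the configuration can read the now-existing cluster's attributes.
terraform apply -target=azurerm_kubernetes_cluster.k8s
terraform apply
```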

I’m refactoring to rely on targeted applies as little as possible, but for now I think we can put a pin in this, since that link provides a good way to stay (or get back) on track if things get wonky. Thanks!