Terraform removed resources that were not in the Terraform plan

Good day,

I would like to discuss an issue that happened to me 7 days ago during a deployment to Azure.

We are trying to stick to IaC, which means that all changes must go through CI/CD. We also have some resources that were not deployed via Terraform, such as the Storage Account (we need it to store the Terraform state).

7 days ago our team tried to apply a new set of changes to our existing infrastructure. Before approving an "apply" we always verify the Terraform plan. That time we verified it and did not find anything strange; the plan looked good. After we applied the new changes, the existing infrastructure was removed, including resources that are not part of the Terraform state.

I don't know whether anybody else has faced such an issue before and could give us some advice on where to look for the cause. We have contacted Microsoft to find the reason, but unfortunately we did not find it.

Have a good day and best regards

Hi @andrii.kondratenko! Can you elaborate on or recall what happened in that apply? The Activity log in the Azure subscription should show exactly what happened and who or what triggered it.

This can happen if a resource group deletion escaped your attention. If a resource group is removed in the current version of the azurerm provider, all resources it contains are also removed. In the upcoming 3.0 version you will, by default, not be able to delete a resource group that still contains resources, but this release is not scheduled yet (at least not publicly).

This behaviour can already be enabled for version 2.72+ of azurerm by using this provider feature flag, prevent_deletion_if_contains_resources.

Provider configuration would look like this:

provider "azurerm" {
  features {
    # ... other feature configuration, if any ...
    resource_group {
      # Refuse to delete a resource group that still contains resources
      prevent_deletion_if_contains_resources = true
    }
  }
}
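Because the flag only exists from azurerm 2.72 onward, it may be worth pinning the provider version next to it. A minimal sketch, assuming the provider is pulled from the public registry as hashicorp/azurerm (the exact constraint is an assumption, adjust it to your setup):

```hcl
terraform {
  required_providers {
    azurerm = {
      # Assumed source address: the public Terraform Registry
      source = "hashicorp/azurerm"
      # The resource_group feature flag requires 2.72 or newer
      version = ">= 2.72.0"
    }
  }
}
```

With a constraint like this, terraform init should fail on an older provider instead of silently ignoring the intent of the flag.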

Hi @aristosvo. Thank you for your reply.

We found all the deletions in the Activity Log; the RG itself was not removed. We also found who triggered the deletions and the exact time they ran. They were triggered by our SP, which we use to apply changes via CI/CD. The time matched the time of the Azure DevOps release.

Here is the Terraform apply output:

Plan: 9 to add, 16 to change, 7 to destroy.
module.cluster.azurerm_role_assignment.mi_contributor_role_general: Destroying... [id=/subscriptions/[subscriptionId]/resourceGroups/[resourceGroupName]/providers/Microsoft.Authorization/roleAssignments/[objectId]]
module.cluster.azurerm_subnet_route_table_association.subnet_route_table_association: Destroying... [id=/subscriptions/subscriptionId/resourceGroups/resourceGroupName/providers/Microsoft.Network/virtualNetworks/[VNetName]/subnets/[name]]
module.cluster.azurerm_role_assignment.read_role: Destroying... [id=/subscriptions/subscriptionId/resourceGroups/resourceGroupName/providers/Microsoft.KeyVault/vaults/[name]/providers/Microsoft.Authorization/roleAssignments/[objectId]]
module.cluster.azurerm_role_assignment.mi_contributor_role_kubelet_general: Destroying... [id=/subscriptions/subscriptionId/resourceGroups/resourceGroupName/providers/Microsoft.Authorization/roleAssignments/[objectId]]
module.cluster.azurerm_subnet_network_security_group_association.subnet_nsg_association: Destroying... [id=/subscriptions/subscriptionId/resourceGroups/resourceGroupName/providers/Microsoft.Network/virtualNetworks/[VNetName]/subnets/[name]]
module.cluster.azurerm_key_vault_access_policy.key_vault_policy_pod: Destroying... [id=/subscriptions/subscriptionId/resourceGroups/resourceGroupName/providers/Microsoft.KeyVault/vaults/[name]/objectId/[objectId]]
module.logic_app.azurerm_resource_group_template_deployment.connection: Creating...
module.webhook_server.azurerm_app_service_plan.webhook_server_plan: Modifying... [id=/subscriptions/subscriptionId/resourceGroups/resourceGroupName/providers/Microsoft.Web/serverfarms/[name]]
module.vm_linux.azurerm_network_interface.vm_network_interface: Modifying... [id=/subscriptions/subscriptionId/resourceGroups/resourceGroupName/providers/Microsoft.Network/networkInterfaces/[name]]
module.sql_managed_instance.azurerm_network_security_group.mi_security_group: Modifying... [id=/subscriptions/subscriptionId/resourceGroups/resourceGroupName/providers/Microsoft.Network/networkSecurityGroups/[name]]
module.vm_linux.azurerm_network_interface.vm_network_interface: Modifications complete after 1s [id=/subscriptions/subscriptionId/resourceGroups/resourceGroupName/providers/Microsoft.Network/networkInterfaces/[name]]
module.vm_linux.azurerm_linux_virtual_machine.vm_linux: Modifying... [id=/subscriptions/subscriptionId/resourceGroups/resourceGroupName/providers/Microsoft.Compute/virtualMachines/[name]]
module.cluster.azurerm_role_assignment.mi_contributor_role_kubelet_general: Destruction complete after 1s
module.sql_managed_instance.azurerm_sql_managed_instance.managedsqlinstance: Modifying... [id=/subscriptions/subscriptionId/resourceGroups/resourceGroupName/providers/Microsoft.Sql/managedInstances/[name]]
module.cluster.azurerm_role_assignment.read_role: Destruction complete after 1s
module.sql_managed_instance.azurerm_route_table.mi_routetable: Modifying... [id=/subscriptions/subscriptionId/resourceGroups/resourceGroupName/providers/Microsoft.Network/routeTables/[name]]
module.cluster.azurerm_role_assignment.mi_contributor_role_general: Destruction complete after 1s
module.sql_managed_instance.azurerm_network_security_group.mi_security_group: Modifications complete after 1s [id=/subscriptions/subscriptionId/resourceGroups/resourceGroupName/providers/Microsoft.Network/networkSecurityGroups/[name]]
module.sql_managed_instance.azurerm_route_table.mi_routetable: Modifications complete after 1s [id=/subscriptions/subscriptionId/resourceGroups/resourceGroupName/providers/Microsoft.Network/routeTables/[name]]
module.vm_linux.azurerm_linux_virtual_machine.vm_linux: Modifications complete after 1s [id=/subscriptions/subscriptionId/resourceGroups/resourceGroupName/providers/Microsoft.Compute/virtualMachines/[name]]
module.cluster.azurerm_subnet_network_security_group_association.subnet_nsg_association: Destruction complete after 4s
module.cluster.azurerm_role_assignment.contributor_role: Destroying... [id=/subscriptions/subscriptionId/resourceGroups/resourceGroupName/providers/Microsoft.Authorization/roleAssignments/[objectId]]
module.webhook_server.azurerm_app_service_plan.webhook_server_plan: Modifications complete after 4s [id=/subscriptions/subscriptionId/resourceGroups/resourceGroupName/providers/Microsoft.Web/serverfarms/[name]]
module.webhook_server.azurerm_app_service.webhook_server: Modifying... [id=/subscriptions/subscriptionId/resourceGroups/resourceGroupName/providers/Microsoft.Web/sites/[name]]

Hi @andrii.kondratenko! Although I have minimal information, one thing catches my attention:

  • module.logic_app.azurerm_resource_group_template_deployment.connection looks interesting to me. This template deployment may have caused the destruction of resources in the resource group the template is deployed to, if the wrong setting is used (deployment_mode = "Complete"). Can you share the Terraform configuration of this template, or at least the deployment_mode?

Hi @aristosvo,

You are right, we had deployment_mode = "Complete". But why was there no information about the destruction in the Terraform plan?

azurerm cannot show the changes made within an ARM template deployment, at least not at the moment. This would require a big change in the resource, which is only used as a stopgap to enable functionality which cannot be deployed using the normal azurerm resources.

This is not easily fixed, and it's unfortunate you had to face this problem! The documentation states clearly how the setting works, but apparently not clearly enough that its effect is not shown in the plan :frowning:

If deployment_mode is set to Complete then resources within this Resource Group which are not defined in the ARM Template will be deleted.
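For comparison, a minimal sketch of the same kind of resource using the non-destructive mode. The deployment name, resource group name, and empty template below are hypothetical placeholders, not taken from the configuration discussed in this thread:

```hcl
resource "azurerm_resource_group_template_deployment" "connection" {
  # Hypothetical names, for illustration only
  name                = "connection-deployment"
  resource_group_name = "example-resources"

  # "Incremental" leaves resources that are not in the template untouched;
  # "Complete" deletes them, and that deletion does not appear in the plan
  deployment_mode = "Incremental"

  # Placeholder ARM template with no resources
  template_content = jsonencode({
    "$schema"      = "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#"
    contentVersion = "1.0.0.0"
    resources      = []
  })
}
```

Since the plan cannot show what the template deployment will do, the deployment_mode value is probably the one thing worth checking by hand whenever this resource shows up in a plan.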

Hi @aristosvo,

Thank you for your investigation. Do you perhaps know how it works under the hood?

The thing is, Terraform did not delete all resources: it deleted 64 of 68 resources, and 62 of the 64 deleted resources were in the Terraform state file. The behavior itself is unclear. How does Terraform know what to delete?

For instance, AKS was removed but the route table was not, and we did not find a delete request for the route table in the logs.

We create resources with the Azure CLI + terraform apply.

BR
A

Hi @andrii.kondratenko!

It works just like any other ARM template deployment. The deletion triggered by the ARM template and Terraform's actions on other resources may have interfered with each other, which could explain why some resources were still available, or were created, after the ARM-template-based deletion.

For a more detailed analysis I'd need access to your subscription, and I don't think that is an option :smiley:

Hope the recovery went well and it never happens again :crossed_fingers:

Hi @aristosvo ,

Yes, we recovered the system. And yes, access to the subscription is not possible, but the information you provided is already enough to think about.

Thank you very much for your help.

BR
A
