Terraform State issue for failed Azure VM

Azure Provider version: 2.33.0
Terraform version: 0.13.5

  • We are creating a virtual machine on Azure, along with a resource group, VNet, subnet and network interfaces, using Terraform. What we’ve observed is that the VM creation sometimes fails and the VM ends up in a ‘Failed’ state, even though it has been provisioned in Azure and has its network interfaces etc. attached.

  • The issue is that after the provisioning fails, the Terraform state file contains the details of all the other provisioned resources but has no entry for the VM. So, when we try to clean up the resources using Terraform, the cleanup also fails while deleting the NIC, because the NIC is still in use by the failed VM. The only way to clean up properly is then to go to the Azure portal and delete the VM manually.

  • We’ve found similar issues reported here:
    Cannot destroy cleanly when VM provisioning ends up in failed state · Issue #98 · hashicorp/terraform-provider-azurerm · GitHub
    Failed Azure VM doesn't update Terraform state · Issue #7236 · hashicorp/terraform-provider-azurerm · GitHub

  • Is there any way to handle this scenario more gracefully, so that no manual intervention is required and we can at least clean up all the resources (including the failed VM) through Terraform itself?
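
For reference, the kind of configuration described above looks roughly like the sketch below. This is a minimal illustration, not our exact configuration: all resource names, the region, the address ranges, the VM size, the image and the SSH key path are placeholders, and a Linux VM (azurerm_linux_virtual_machine) is assumed only for the sake of the example.

```hcl
terraform {
  required_version = ">= 0.13"

  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "=2.33.0"
    }
  }
}

provider "azurerm" {
  features {}
}

# All names and values below are placeholders for illustration.
resource "azurerm_resource_group" "example" {
  name     = "example-rg"
  location = "West Europe"
}

resource "azurerm_virtual_network" "example" {
  name                = "example-vnet"
  address_space       = ["10.0.0.0/16"]
  location            = azurerm_resource_group.example.location
  resource_group_name = azurerm_resource_group.example.name
}

resource "azurerm_subnet" "example" {
  name                 = "example-subnet"
  resource_group_name  = azurerm_resource_group.example.name
  virtual_network_name = azurerm_virtual_network.example.name
  address_prefixes     = ["10.0.2.0/24"]
}

resource "azurerm_network_interface" "example" {
  name                = "example-nic"
  location            = azurerm_resource_group.example.location
  resource_group_name = azurerm_resource_group.example.name

  ip_configuration {
    name                          = "internal"
    subnet_id                     = azurerm_subnet.example.id
    private_ip_address_allocation = "Dynamic"
  }
}

# This is the resource that is missing from the state file when
# provisioning fails, while the NIC above is still attached to it in Azure.
resource "azurerm_linux_virtual_machine" "example" {
  name                  = "example-vm"
  resource_group_name   = azurerm_resource_group.example.name
  location              = azurerm_resource_group.example.location
  size                  = "Standard_B1s"
  admin_username        = "adminuser"
  network_interface_ids = [azurerm_network_interface.example.id]

  admin_ssh_key {
    username   = "adminuser"
    public_key = file("~/.ssh/id_rsa.pub") # placeholder key path
  }

  os_disk {
    caching              = "ReadWrite"
    storage_account_type = "Standard_LRS"
  }

  source_image_reference {
    publisher = "Canonical"
    offer     = "UbuntuServer"
    sku       = "18.04-LTS"
    version   = "latest"
  }
}
```

With a layout like this, the resource group, VNet, subnet and NIC are recorded in state after a failed apply, but the VM is not, so a subsequent destroy tries to remove the NIC while Azure still reports it as attached to the VM.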