Terraform apply and order of orphan modules deletion

Hi,
We are using Terraform 1.7.5 with local private providers and modules.
We first apply the following configurations:

...
  provider "cis" {
      region = local.automation_context.subaccount_region
      alias  = "as-ga-admin"

      # Set credentials from the cloud-automation-client secret
      cis_central_client_id     = local.automation_context.client_id
      cis_central_client_secret = local.automation_context.client_secret
      cis_central_oauth_url     = local.automation_context.oauth_url
      cis_central_domain        = local.automation_context.domain
    }

    provider "cis" {
      alias = "as-sa-admin"
      region = local.automation_context.subaccount_region
      cis_local_credentials = module.subaccount.cis_local_credentials
    }

    provider "sm" {
      region = local.subaccount_region

      client_id     = module.subaccount.cis_service_management_binding.client_id    
      client_secret = module.subaccount.cis_service_management_binding.client_secret
      url           = module.subaccount.cis_service_management_binding.url          
      sm_url        = module.subaccount.cis_service_management_binding.sm_url       
    }

    provider "xsuaa" {
      region                   = local.automation_context.subaccount_region
      domain                   = local.cis_local_domain
        
      cis_local_client_id      = local.cis_local_binding.clientid
      cis_local_client_secret  = local.cis_local_binding.clientsecret
      cis_local_oauth_url      = local.cis_local_binding.url
    }

    module "subaccount" {
      source = "http://api.cloud-automation-registry/modules/sap-managed-cis-subaccount"

      automation_context      = local.automation_context
      customer_id             = var.tenantContext.details.customer.id
      subaccount_admins       = local.subaccount_admins
      subaccount_display_name = local.subaccount_display_name
    }

    module "assignment-1" {
      source = "http://api.cloud-automation-registry/modules/sap-managed-entitlements"
      providers = {
        cis = cis.as-ga-admin
      }

      service_name       = "auditlog-viewer"
      service_plan_name  = "free"
      subaccount_guid    = module.subaccount.subaccount_guid
      automation_context = local.automation_context
    }

    module "subscription-auditlog-viewer" {
      source = "http://api.cloud-automation-registry/modules/sap-managed-subscription"
      providers = {
        cis = cis.as-sa-admin
      }

      automation_context     = local.automation_context
      saas_app_name          = "auditlog-viewer"
      saas_plan_name         = "free"
      subaccount_credentials = local.subaccount_credentials

      # Explicit dependency: create the subscription only after the entitlement assignment.
      depends_on = [module.assignment-1]
    }

As you can see there is an explicit dependency between the subscription-auditlog-viewer module and the assignment-1 module.

On the first apply the order is correct: Terraform creates the subscription after the assignment.

We are using the S3 backend for managing the state, and I can see in the state file that the dependency was persisted successfully.

From the state file:

"resources": [
    {
      "module": "module.assignment-1",
      "mode": "managed",
      "type": "cis_entitlements",
      "name": "entitlements",
      "provider": "provider[\"api.cloud-automation-registry/providers/cis\"].as-ga-admin",
      "instances": [
        {
          "index_key": 0,
          "schema_version": 0,
          "attributes": {
            ...
          "dependencies": [
            "module.subaccount.cis_subaccount.cis_subaccount"
          ]
        }
      ]
    },
   ....
{
      "module": "module.subscription-auditlog-viewer",
      "mode": "managed",
      "type": "cis_saas_subscription",
      "name": "saas_subscription",
      "provider": "provider[\"api.cloud-automation-registry/providers/cis\"].as-sa-admin",
      "instances": [
        {
        ...
          "dependencies": [
            "module.assignment-1.cis_entitlements.entitlements",
            ...
          ]
        }
      ]

After that, we would like to remove both the subscription and the assignment and re-apply the configuration:

...
  provider "cis" {
      region = local.automation_context.subaccount_region
      alias  = "as-ga-admin"

      # Set credentials from the cloud-automation-client secret
      cis_central_client_id     = local.automation_context.client_id
      cis_central_client_secret = local.automation_context.client_secret
      cis_central_oauth_url     = local.automation_context.oauth_url
      cis_central_domain        = local.automation_context.domain
    }

    provider "cis" {
      alias = "as-sa-admin"
      region = local.automation_context.subaccount_region
      cis_local_credentials = module.subaccount.cis_local_credentials
    }

    provider "sm" {
      region = local.subaccount_region

      client_id     = module.subaccount.cis_service_management_binding.client_id    
      client_secret = module.subaccount.cis_service_management_binding.client_secret
      url           = module.subaccount.cis_service_management_binding.url          
      sm_url        = module.subaccount.cis_service_management_binding.sm_url       
    }

    provider "xsuaa" {
      region                   = local.automation_context.subaccount_region
      domain                   = local.cis_local_domain
        
      cis_local_client_id      = local.cis_local_binding.clientid
      cis_local_client_secret  = local.cis_local_binding.clientsecret
      cis_local_oauth_url      = local.cis_local_binding.url
    }

    module "subaccount" {
      source = "http://api.cloud-automation-registry/modules/sap-managed-cis-subaccount"

      automation_context      = local.automation_context
      customer_id             = var.tenantContext.details.customer.id
      subaccount_admins       = local.subaccount_admins
      subaccount_display_name = local.subaccount_display_name
    }

Now the graph looks like:

As can be seen from the graph, there is a missing edge in the dependencies. I expected the assignment and the subscription to be deleted in the reverse order, meaning first the subscription and then the assignment. But Terraform actually tries to delete the assignment first and then the subscription, and therefore it fails.

This looks like a bug to me, but I would like to understand whether I’m missing something before opening a bug report.

Thanks,
Nimrod Oron

Hi @nimrodoron,

Delete operations must happen in the reverse order of create operations. So since you have created the resources in the order

module.subaccount.cis_subaccount.cis_subaccount
module.assignment-1.cis_entitlements.entitlements

the destroy order is going to be

module.assignment-1.cis_entitlements.entitlements
module.subaccount.cis_subaccount.cis_subaccount

Given the create order of A then B, Terraform must assume that if B depends on A for creation, it continues to depend on A for its entire lifespan, so A cannot be deleted before B.
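As a minimal illustration of that rule, here is a sketch that uses the built-in terraform_data resource instead of the cis provider. The resource names a and b are placeholders and are not taken from the configuration above.

```hcl
terraform {
  required_version = ">= 1.4.0" # terraform_data is built in from 1.4 onward
}

# "a" has no dependencies, so it is created first.
resource "terraform_data" "a" {
  input = "a"
}

# "b" references "a", so it is created after "a".
# The dependency is stored in the state, so if both blocks are later
# removed from the configuration, "b" is destroyed before "a".
resource "terraform_data" "b" {
  input = terraform_data.a.output
}
```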

Hi @jbardin, indeed this is the order I expected. But please look at my post again. I wasn’t talking about module.subaccount; it was not deleted at all. I removed module.assignment-1 and module.subscription-auditlog-viewer. The order was wrong there: the assignment was deleted before the subscription, when it should have been deleted after it.

Regards,
Nimrod Oron

Sorry, I grabbed the wrong resource when the rest was scrolled off-screen in the text box, but the general order still stands. If that’s the order you were expecting, then I’m not sure what the problem is here. Are you referring solely to the graph output, or was there an actual problem you encountered during apply?

I was expecting the order on deletion:
module.subscription-auditlog-viewer → module.assignment-1

(The order on creation was: module.assignment-1 → module.subscription-auditlog-viewer)

But on deletion, Terraform actually tried to delete module.assignment-1 before module.subscription-auditlog-viewer, and therefore there was an error.

In my opinion, the graph shows a problem in how Terraform plans the execution. An edge between those resources that existed in the first apply is missing… But maybe it is just the graph, I don’t know. The actual problem is the wrong order on deletion, which does not match the order I expected and that you confirmed to be the right one.

Regards,
Nimrod Oron

It would help to have the actual log output from apply when there was an error, or a way to reproduce the error. The graph output hides a lot of very implementation-specific context within it, doesn’t show all internal details, and requires more understanding of the internals than we would like. Modules themselves are only a container for configuration, and have no real order of operation. In fact, dependencies can just as easily go back and forth between modules, so we have to look only at the instances involved.

Given the stored dependencies shown in the state file, when it comes time for the actual destroy operations, Terraform is going to follow those in reverse order. When this fails, in many cases it’s simply a matter of eventual consistency in the remote system, when the second destroy operation does not yet know the earlier destroy happened.

Here are the logs (First there is planning there, then the graph and finally the actual apply)
log.txt (77.3 KB)

Log with trace enabled:

log-full.txt (2.1 MB)

Regards,
Nimrod Oron

Hi @nimrodoron,

From what I can see in the logs, I don’t think anything was applied at all! The terraform apply command wasn’t acting on a plan, and it appears to have failed during the planning process, and repeats the same error from earlier in the logs.

The error I can extract from the logs happens when reading the resource during the plan. It’s a 24 KB error message, but it starts with “failed to assign entitlements to subaccount”.

As far as how to interpret that error, I’m not familiar enough with this provider or resources to know what it might mean. The verbs in the error message imply it was trying to complete some action of “assigning entitlements”, and if that means making changes to the resources then the provider was making changes when it shouldn’t have.

The error “failed to assign entitlements to subaccount” comes from the deletion of module.assignment-1. It occurs because there is still an active subscription (module.subscription-auditlog-viewer). I agree, this is internal information of the provider which you couldn’t know… But the issue is that Terraform is trying to delete module.assignment-1 before it deletes module.subscription-auditlog-viewer.

From the logs you can see:

...
# module.assignment-1.cis_entitlements.entitlements will be destroyed
  # (because cis_entitlements.entitlements is not in configuration)
  - resource "cis_entitlements" "entitlements" {
      - global_account_guid      = "b9eb34fa-1a21-424b-8ec3-8f91c8553c8d" -> null
      - global_account_subdomain = "b9eb34fa-1a21-424b-8ec3-8f91c8553c8d" -> null
      - id                       = "1b4cac92-4bfa-424c-a093-78d281c1042b" -> null
      - job_instance_id          = "7910303" -> null
      - status                   = "OK" -> null
      - subaccount_guid          = "1b4cac92-4bfa-424c-a093-78d281c1042b" -> null

      - entitlements {
          - amount            = 0 -> null
          - enable            = true -> null
          - service_name      = "auditlog-viewer" -> null
          - service_plan_name = "free" -> null
          - status            = "ASSIGNED" -> null
        }
    }
...
  # module.subscription-auditlog-viewer.cis_saas_subscription.saas_subscription will be destroyed
  # (because cis_saas_subscription.saas_subscription is not in configuration)
  - resource "cis_saas_subscription" "saas_subscription" {
      - additional_output           = {} -> null
      - app_id                      = "auditlog-viewer!t3034" -> null
      - app_name                    = "auditlog-viewer" -> null
      - global_account_id           = "41418523-412b-4980-9ea5-cd283792daa1" -> null
      - id                          = "1b4cac92-4bfa-424c-a093-78d281c1042b_auditlog-viewer_free" -> null
      - plan_name                   = "free" -> null
      - quota                       = 2 -> null
      - state                       = "SUBSCRIBED" -> null
      - subaccount_guid             = "1b4cac92-4bfa-424c-a093-78d281c1042b" -> null
      - subaccount_subdomain        = "1709384461986" -> null
      - subscribed_tenant_id        = "1b4cac92-4bfa-424c-a093-78d281c1042b" -> null
      - subscription_url            = "https://1709384461986.auditlog-viewer.cfapps.sap.hana.ondemand.com" -> null
      - supports_parameters_updates = false -> null
      - supports_plan_updates       = false -> null
    }

Plan: 0 to add, 3 to change, 2 to destroy.

Then the first action is:

{"@level":"info","@message":"module.assignment-1.cis_entitlements.entitlements: Destroying... [id=1b4cac92-4bfa-424c-a093-78d281c1042b]","@module":"terraform.ui","@timestamp":"2024-03-26T17:39:58.001349Z","hook":{"resource":{"addr":"module.assignment-1.cis_entitlements.entitlements","module":"module.assignment-1","resource":"cis_entitlements.entitlements","implied_provider":"cis","resource_type":"cis_entitlements","resource_name":"entitlements","resource_key":null},"action":"delete","id_key":"id","id_value":"1b4cac92-4bfa-424c-a093-78d281c1042b"},"type":"apply_start"}

So it starts by deleting module.assignment-1.
Then the error occurs in the provider, because the API call fails due to internal implementation details.

If Terraform had started with the destroy of module.subscription-auditlog-viewer.cis_saas_subscription.saas_subscription as expected, the later destroy of module.assignment-1 would have succeeded.

Just to be clear, these are the logs of the apply after deleting the modules from the configuration. I didn’t attach the logs of the first apply, which created all the resources when the configuration had all the information. There, everything was OK: module.assignment-1 was created first and module.subscription-auditlog-viewer later.

Regards,
Nimrod Oron

Yes, I understand the resources were removed from the configuration, but the error is being returned during the ReadResource RPC call, not during ApplyResourceChange. No destroy operations were being attempted.

The logs never show Terraform even beginning the apply process, this is all happening during the plan before any changes should be made by the provider.

This section:

{"@level":"info","@message":"module.assignment-1.cis_entitlements.entitlements: Destroying... [id=1b4cac92-4bfa-424c-a093-78d281c1042b]","@module":"terraform.ui","@timestamp":"2024-03-26T17:39:58.001349Z","hook":{"resource":{"addr":"module.assignment-1.cis_entitlements.entitlements","module":"module.assignment-1","resource":"cis_entitlements.entitlements","implied_provider":"cis","resource_type":"cis_entitlements","resource_name":"entitlements","resource_key":null},"action":"delete","id_key":"id","id_value":"1b4cac92-4bfa-424c-a093-78d281c1042b"},"type":"apply_start"}
{"@level":"info","@message":"module.assignment-1.cis_entitlements.entitlements: Still destroying... [10s elapsed]","@module":"terraform.ui","@timestamp":"2024-03-26T17:40:08.004690Z","hook":{"resource":{"addr":"module.assignment-1.cis_entitlements.entitlements","module":"module.assignment-1","resource":"cis_entitlements.entitlements","implied_provider":"cis","resource_type":"cis_entitlements","resource_name":"entitlements","resource_key":null},"action":"delete","elapsed_seconds":10},"type":"apply_progress"}
{"@level":"info","@message":"module.assignment-1.cis_entitlements.entitlements: Destruction errored after 13s","@module":"terraform.ui","@timestamp":"2024-03-26T17:40:10.821564Z","hook":{"resource":{"addr":"module.assignment-1.cis_entitlements.entitlements","module":"module.assignment-1","resource":"cis_entitlements.entitlements","implied_provider":"cis","resource_type":"cis_entitlements","resource_name":"entitlements","resource_key":null},"action":"delete","elapsed_seconds":13},"type":"apply_errored"}

Doesn’t it mean that the deletion is in progress already?

I think there is a problem with log-full. Please look at the log (only info messages). I will try to get a new log with trace enabled.

I did not read the short log closely, because it doesn’t have the information I need to confirm what the provider is doing. The full trace log does not make it to the point of applying, and looks like it’s erroring out during the plan.

The json output lines do look like cis_entitlements.entitlements was the only resource where destroy was attempted, but I can’t confirm how we got there. It is very difficult to parse this with all the provider logs, core logs, human readable output and json streaming output mixed together, so I may still be missing something.

If you can apply a plan with TF_LOG_CORE=trace that would show exactly what actions Terraform is taking, and in what order. If you can make the plan separately first it helps a lot too, because that both narrows down where the error occurred, and reduces a lot of the work that needs to be done and logged during apply.

Please have a look at this log.
This is the full apply log. The apply wasn’t given a plan file, so it first does the planning and then the apply.

apply-log.txt (1.7 MB)

In line 13456 you can see the log: 2024-03-26T19:49:26.498Z [DEBUG] module.assignment-1.cis_entitlements.entitlements: applying the planned Delete change

then in line 13460: 2024-03-26T19:49:26.501Z [TRACE] provider.terraform-provider-cis_v1.93.0_x3: Received request: tf_rpc=ApplyResourceChange @caller=/go/pkg/mod/github.com/hashicorp/terraform-plugin-go@v0.14.3/tfprotov5/tf5server/server.go:805 @module=sdk.proto tf_proto_version=5.3 tf_provider_addr=provider tf_req_id=6f82e28e-891e-bb42-955d-81404ac0a06e tf_resource_type=cis_entitlements timestamp=2024-03-26T19:49:26.500Z

line 13556: 2024-03-26T19:49:38.956Z [ERROR] vertex "module.assignment-1.cis_entitlements.entitlements (destroy)" error: Provider [cis] failed with the following error: failed to assign entitlements to subaccount '7133c524-5792-470d-b919-6fe39d8cb5cb', entitlements response:

Thank you @nimrodoron, that log is much more complete, and can tell us at least partly why you are encountering this problem. A big key is that I did not realize there are multiple providers involved.

There is an edge missing, but Terraform drops this purposely, indicated by the line

skipping inter-provider edge module.assignment-1.cis_entitlements.entitlements (destroy)->module.subscription-auditlog-viewer.cis_saas_subscription.saas_subscription (destroy) which creates a cycle

There is an edge case here: when a provider is situated in the graph between two resources from different providers which are both being destroyed, it can cause a cycle, because the provider’s dependencies must exist longer than the execution of that provider. That log line corresponds only to the situation where a cycle has definitively been detected, so it’s not a false positive, although I don’t yet fully understand what is causing it in this situation.

Debugging that further will take some time given all the individual elements of this configuration. It may be possible to simplify this config further to help. Use of depends_on with a module is often a red flag that something is not correct, as that introduces dependencies from everything in one module to everything in another, when often only a subset of those dependencies are required. Passing an output from the required dependencies in one module into the other module where they are consumed may be able to reduce the interdependencies (though these modules don’t look too complex from what I can see).
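A rough sketch of what passing an output instead of a module-level depends_on could look like. The output name entitlement_ids and the variable of the same name are hypothetical, since I have not seen the internals of these modules, and this narrows the dependency without necessarily removing the inter-provider edge discussed above.

```hcl
# --- inside the sap-managed-entitlements module (hypothetical output) ---
output "entitlement_ids" {
  # The state shows an index_key on this resource, so it is assumed to use count.
  value = cis_entitlements.entitlements[*].id
}

# --- inside the sap-managed-subscription module (hypothetical variable) ---
variable "entitlement_ids" {
  type        = list(string)
  description = "IDs of the entitlements that the subscription relies on."
}

# The resource inside that module has to reference var.entitlement_ids in
# some way (an argument, or a resource-level depends_on), otherwise
# Terraform records no dependency on the entitlements.

# --- in the root configuration ---
module "subscription-auditlog-viewer" {
  source = "http://api.cloud-automation-registry/modules/sap-managed-subscription"
  providers = {
    cis = cis.as-sa-admin
  }

  automation_context     = local.automation_context
  saas_app_name          = "auditlog-viewer"
  saas_plan_name         = "free"
  subaccount_credentials = local.subaccount_credentials

  # Replaces depends_on = [module.assignment-1]
  entitlement_ids = module.assignment-1.entitlement_ids
}
```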

Had a moment to trace the cycle, and the dependency chain which prevents the destroy nodes from being connected in the opposite direction is:

module.assignment-1.cis_entitlements.entitlements (destroy)
module.subaccount.cis_subaccount.cis_subaccount
module.subaccount.local.cis_subaccount
module.subaccount.module.cis_local_service_instance.var.subaccount_guid
module.subaccount.module.cis_local_service_instance.cis_entitlements.cis_local_entitlements
module.subaccount.module.cis_local_service_instance.sm_service_instance.cis_local_service_instance
module.subaccount.module.cis_local_service_instance.sm_service_binding.cis_local_service_binding
module.subaccount.module.cis_local_service_instance.local.output_credentials
module.subaccount.module.cis_local_service_instance.output.credentials
module.subaccount.output.cis_local_credentials
provider["api.cloud-automation-registry/providers/cis"].as-sa-admin
module.subscription-auditlog-viewer.cis_saas_subscription.saas_subscription (destroy)

In the rare case that a destroy chain has a provider in it, Terraform can usually discard that edge because resources from different providers would almost never have direct functional dependencies on one another requiring the strict destroy ordering. In this case however the different cis providers seem to be operating on shared resources.

Hi @jbardin,
Indeed, I was able to visualize the cycles when creating a graph from an actual plan.
I’m not sure how to solve it yet, but I understand the problem now.

Thank you very much.

Regards,
Nimrod Oron

Hi @jbardin,
After simplifying the configurations this is the current situation:

Dependencies on apply:
subaccount → entitlement (implicit subaccount id) → *(resources needed to init the provider for the subscription, which have an implicit dependency on the subaccount id) → subscription (explicit depends_on the entitlements and provider dependency on the subaccount)

Now, there is a change in the subaccount (for example the display name) and removal of both the entitlements and the subscription.

So in the new apply graph:
entitlement → subaccount → *(resources needed to init the provider for the subscription, which have an implicit dependency on the subaccount id) → subscription (provider depends on the subaccount)
And now Terraform removes the edge subscription → entitlements, which would otherwise cause the cycle.

Actually, there is no change in the resources needed to init the provider for the subscription, nor in the module output that encapsulates them, so what I don’t want are the edges:
subaccount → *(resources needed to init the provider for the subscription, which have an implicit dependency on the subaccount id) → subscription

Can I achieve this in some way? It would be something like ignoring changes to a module output when initializing the providers.

Regards,
Nimrod Oron

I don’t think I can give a solid answer without spending some time with the complete configuration. In general, however, Terraform must set up all the dependencies before it can even get to the point of determining what has changed, so there is no way to avoid cycles based on whether there may be a change or not.

The only certain method for avoiding this type of problem is to split the configurations up into discrete parts to be applied separately. The fact that a provider can be configured from managed resource outputs is something we’ve had to maintain for compatibility, but is discouraged because it can lead to intractable situations (usually regarding unknown inputs to the provider, but cycles are also inevitable).
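A sketch of one way such a split could look, assuming the subaccount configuration is applied first and exposes the credentials as a root-level output. The bucket, key, region and output name below are placeholders, not values from this thread.

```hcl
# --- configuration 1 (subaccount), applied first ---
# Hypothetical root-level output exposing what the second configuration needs.
output "cis_local_credentials" {
  value     = module.subaccount.cis_local_credentials
  sensitive = true
}

# --- configuration 2 (entitlements and subscription), applied separately ---
# Reads the outputs of configuration 1 from its S3 state.
data "terraform_remote_state" "subaccount" {
  backend = "s3"

  config = {
    bucket = "example-terraform-state"      # placeholder
    key    = "subaccount/terraform.tfstate" # placeholder
    region = "eu-central-1"                 # placeholder
  }
}

# The provider is now configured from values that are already known at plan
# time, rather than from resources managed in the same configuration.
provider "cis" {
  alias                 = "as-sa-admin"
  region                = local.automation_context.subaccount_region
  cis_local_credentials = data.terraform_remote_state.subaccount.outputs.cis_local_credentials
}
```

With this layout the provider configuration no longer sits between the two destroy nodes in the same graph, at the cost of having to apply the two configurations in sequence.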