Dynamic task foreach order changes

I use for_each for dynamic tasks (Databricks) as follows:

locals {
  md_tables = [
    "table1",
    "table2",
    "....",
    "table n",
  ]
}
resource "databricks_job" "lakehouse-data" {
  name = "master_data_${var.catalog}"  
  j.......
###### bronze tasks
  dynamic "task" {
    for_each = local.md_tables
    content {
      task_key        = join("-", ["bronze", replace(task.value, ".", "-")])
      job_cluster_key = "j"
      notebook_task {
        notebook_path = var.stream_notebook_path
        base_parameters = {
          "input_source"    = var.lake_input_path
          "output_path"     = var.output_path
          "orig_table_name" = task.value
          "catalog"         = var.catalog
        }
      }
    }
  }
....

The problem is that every time I execute the plan, the order of the tasks changes, and it appears as if the resources have changed when in fact they have not. It's very difficult to track changes this way. Is there a way to make sure the order is guaranteed?


Hi @amit.cahanovich,

Without seeing the actual plan output I can only guess what’s going on here, but one relatively-common cause for behavior like this is when a provider doesn’t correctly normalize changes made by the API it is wrapping.

Assuming this is the databricks/databricks provider, I can see in its source code that the task block type is declared in the provider schema as having a meaningful order, and so the internal representation of this series of blocks is a list of objects.

For that to be a correct data type to use, the remote API would need to be able to remember the exact declaration order for tasks and return them in exactly the same order during the “read” operation.

Sometimes provider developers use a list type even though the remote API cannot preserve the order, and so then on the next “read” call (which happens automatically during the next Terraform operation) the provider returns the task blocks in a different order and Terraform assumes that change in order was meaningful because the schema says that tasks are ordered.

If this is the problem in your case then the symptom I would expect is for the plan to show that the order of the task blocks will be changed from some other order back into the order you listed them in local.md_tables. The provider has told Terraform that the order changed outside of Terraform but then during the planning step proposes to change them back into your declaration order so that the value will match the configuration.

If this is what you are seeing then unfortunately the real fix here will need to be inside the provider itself, and so I would normally suggest reporting a bug in the provider’s GitHub repository. When I checked I saw this existing issue which I think you also created, though:

In the meantime there are some possible workarounds for this kind of problem. You should only choose one of the following because they are all independent workarounds:

  1. Change your local.md_tables value to list the tables in the same order that the provider is returning them during the “read” operation, so that the configuration will match the order and so the provider won’t propose to make any changes during planning.

    This is the most robust solution but it will only work if the provider is returning the tasks in a consistent order on every read. If the remote API returns the tasks in a different order on each call, and the provider doesn’t impose a predictable ordering on them before returning them, then this workaround won’t work because the order will change every time.

  2. In your resource "databricks_job" block, add a lifecycle block and inside it write ignore_changes = [task] (see the sketch after this list), to tell Terraform to ignore any differences between the configuration and the apparent current state of the task blocks.

    This workaround will avoid proposing any updates for task blocks at all, so you won’t be able to change the set of tasks for an object once it’s been created unless you temporarily remove the ignore_changes setting and then restore it afterwards.

  3. Run terraform plan or terraform apply with the -refresh=false option to disable the “read” step which is probably what is making the task blocks become incorrectly ordered.

    This option should be a last resort because it will prevent Terraform from detecting any changes made in the remote system, and so the generated plan might be invalid if something does change outside of Terraform. This includes all resources in your configuration, not just the databricks_job resource.
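For example, the second workaround applied to the original configuration would look roughly like this (a minimal sketch; everything else in the resource stays exactly as it was):

resource "databricks_job" "lakehouse-data" {
  name = "master_data_${var.catalog}"

  # ... job cluster settings and the dynamic "task" blocks exactly as before ...

  lifecycle {
    # Ignore any differences the provider reports for the task blocks,
    # including the spurious reordering. Genuine task changes will also
    # be ignored until this setting is removed again.
    ignore_changes = [task]
  }
}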

Hi @amit.cahanovich , I implemented something similar and encountered the same problem: the tasks were always flagged as changing. What worked for me was changing the md_tables list to a map, with the task name as the key, since Terraform iterates a map sorted by its keys. However, adding a new element will still cause a reorder.
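Applied to the original example it could look something like this (just a sketch; md_table_list is a hypothetical rename of the original list, and the existing dynamic "task" block keeps using task.value as the table name):

locals {
  md_table_list = ["table1", "table2", "table n"]

  # Map keyed by table name; Terraform iterates maps in lexical key order,
  # so the generated task blocks keep a stable order between plans.
  md_tables = { for t in local.md_table_list : t => t }
}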

Hi @apparentlymart , how are you? Hope you are doing fine.

I’m new to Terraform and I’m deploying Databricks workflow jobs with Terraform modules, reading the resource content from JSON files saved in a folder.

I had the same problem with the task order, because our team just can’t keep all the tasks ordered, so maybe a solution is to sort the list of task objects. Could you give me a hint on how to do that? I have spent a lot of time trying without success.

JSON example:

{
  "name": "notebook-job-test",
  "email_notifications": {},
  (...)
  "tasks": [
    {
      "task_key": "task-job-two",
      "notebook_task": {
        "notebook_path": "/notebook/notebook-two
        "source": "WORKSPACE"
      },
      "job_cluster_key": "job_cluster_test",
      "timeout_seconds": 0,
      "email_notifications": {}
    },
    {
      "task_key": "task-job-one",
      "notebook_task": {
        "notebook_path": "/notebook/notebook-one",
        "source": "WORKSPACE"
      },
      "job_cluster_key": "job_cluster_test",
      "timeout_seconds": 0,
      "email_notifications": {
        "on_failure": [
          "ext-harlem.jose@bbts.com.br"
        ]
      }
    }
  ],
  "job_clusters": [{...}],
  "format": "MULTI_TASK"
}

In my module file I read the whole folder containing the job JSON files:

locals {
  job_files = fileset("../databricks/workflows/jobs/", "*.json")
  job_data  = [for f in local.job_files : jsondecode(file("../databricks/workflows/jobs/${f}"))]
}

In my resource block every field and block is created dynamically from the JSON files, but the remaining goal is to order the tasks so that every plan and apply no longer suggests changes:

resource "databricks_job" "main" {
  for_each                  = { for f in local.job_data : f.name => f }
  name                      = lookup(each.value, "name", null)
  min_retry_interval_millis = lookup(each.value, "min_retry_interval_millis", null)
  always_running            = lookup(each.value, "always_running", null)
  tags                      = lookup(each.value, "tags", null)
  retry_on_timeout          = lookup(each.value, "retry_on_timeout", null)
  max_retries               = lookup(each.value, "max_retries", null)
  timeout_seconds           = lookup(each.value, "timeout_seconds", null)
  max_concurrent_runs       = lookup(each.value, "max_concurrent_runs", null)

  dynamic "task" {
    for_each = lookup(each.value, "tasks", null)[*]
    # for_each = tolist(each.value.tasks)[*]
    # for_each = {for task in each.value.tasks: task.task_key => task}
    content {
      task_key            = lookup(each.value.tasks[task.key], "task_key", null)
      existing_cluster_id = lookup(each.value.tasks[task.key], "existing_cluster_id", null)
      job_cluster_key     = lookup(each.value.tasks[task.key], "job_cluster_key", null)
      timeout_seconds     = lookup(each.value.tasks[task.key], "timeout_seconds", null)
      (...)
    }
  }
  (...)
}

So when plan and apply are run, the task block with task_key “task-job-one” should be created before the task block with task_key “task-job-two”. Could you please help me figure out how to solve this?

Thank you in advance.
Have a nice weekend.

Harlem Muniz

I am having the same issue. Is there any update on this?

This issue seems to be resolved in the latest version when naming the tasks with number prefixes:

01_task…
02_task…
03_task…
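One way to derive those prefixes automatically, sketched against the md_tables list from the original post rather than anything the provider does itself, is to build them from the list index so that lexical order matches declaration order:

  dynamic "task" {
    # Keys like "01_bronze-table1", "02_bronze-table2", ... sort lexically
    # in the same order the tables were declared.
    for_each = {
      for i, t in local.md_tables :
      format("%02d_bronze-%s", i + 1, replace(t, ".", "-")) => t
    }
    content {
      task_key        = task.key
      job_cluster_key = "j"
      # ... notebook_task block as before, with task.value as the table name ...
    }
  }

The %02d prefix keeps the lexical order correct for up to 99 tasks; widen it if you have more.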

Thanks for sharing this information, @dtcMLOps.

Since this problem seems to live in the interaction between the Databricks provider and the Databricks API, I cannot comment with certainty since I’m not very familiar with either and in particular I cannot see the implementation of the Databricks API itself.

However, based on the behavior you’ve observed my guess would be that the Databricks API is ordering these items lexically by task_key when it responds. If that’s true then that suggests both a more specific workaround you could use with current releases and also a possible way that this could be fixed in the provider itself so that a config-level workaround would not be required.


The config-level workaround:

Make your dynamic block iterate over a map whose keys are the task_key strings. Terraform iterates maps in key-lexical order, so that should cause the order to match the remote API’s interpretation as long as the remote API is using the same definition of string ordering that Terraform does:

  dynamic "task" {
    for_each = {
      for t in wherever_tasks_come_from : t.task_key => t
    }
    content {
      # ...
    }
  }

wherever_tasks_come_from in the above is a placeholder for an expression that produces a list or set of objects that each represent a task. For example, in what @harlemmuniz shared that would be lookup(each.value, "tasks", null)[*].
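Applied to what @harlemmuniz shared, the inner block might then look roughly like this (a sketch only; it also uses task.value for the nested lookups instead of indexing back into each.value.tasks):

  dynamic "task" {
    # Key the map by task_key so the blocks are generated in lexical order.
    for_each = { for t in lookup(each.value, "tasks", []) : t.task_key => t }
    content {
      task_key            = task.key
      existing_cluster_id = lookup(task.value, "existing_cluster_id", null)
      job_cluster_key     = lookup(task.value, "job_cluster_key", null)
      timeout_seconds     = lookup(task.value, "timeout_seconds", null)
      # ... remaining arguments and nested blocks as before ...
    }
  }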


The possible avenue for fixing the provider bug:

This is unfortunately not straightforward to fix with the current implementation details of the databricks provider since it’s implemented with the legacy Terraform plugin SDK and so the facilities for normalizing values during refresh and plan are quite limited and clunky to use.

In particular, the legacy SDK treats nested blocks as a sequence of entirely-separate values and doesn’t provide many facilities that allow treating the entire list of blocks together as a single value. The typical approach to API value normalization used in that SDK – the DiffSuppressFunc function – only works on an individual-attribute basis and so cannot “suppress the diff” for an entire set of blocks of a particular type.

Therefore a provider-level fix here will probably only be partial, but could in theory be implemented like this:

  • Add a CustomizeDiff function to the entire databricks_job resource type. This function gets to work on the entirety of a particular resource instance at the same time, so it can make global decisions including analyzing the entirety of a list of blocks.
  • Inside the CustomizeDiff function, use old, new := d.GetChange("task") (where d is ResourceDiff) to obtain both the old and new lists of blocks of that type.
  • Compare both lists to see if they contain the same objects but in a different order. If that’s true, call d.SetNew("task", old) to create a similar effect as DiffSuppressFunc but for the entire set of task blocks, rather than for an individual attribute inside one of them.
  • In the Read function for the databricks_job resource type, use similar logic to compare the prior state (from d.Get("task")) with the value read from the API. If the value read from the API has the same objects but in a different order, then totally skip calling d.Set("task", ...) so that the previous run’s result is preserved. This is similar to the effect of using DiffSuppressFunc along with DiffSuppressOnRefresh, which would normally make the SDK use DiffSuppressFunc both with the Read result and with the planned changes, to avoid creating errant “Objects changed outside of Terraform” noise in the plan.