Databricks provider error: somehow resource id is not set

plieberg · January 4, 2022, 4:05pm

I’m writing an internal module for managing our Azure Databricks resources. The first iteration that simply created a workspace ran fine. However, I am now trying to add clusters and instance pools and running into an issue. It appears to be an Azure auth issue:

Error: cannot configure azure-client-secret auth: cannot get workspace: somehow resource id is not set. Attributes used: azure_client_id, azure_client_secret, azure_tenant_id. Please check https://registry.terraform.io/providers/databrickslabs/databricks/latest/docs#authentication for details

  with module.resources.data.databricks_spark_version.spark_version,
  on ../resources/data.tf line 12, in data "databricks_spark_version" "spark_version":
  12: data "databricks_spark_version" "spark_version" {

I included the proper depends_on statements for the data blocks, the cluster, and the instance pool, but I still get this error. Based on some Google searching, I tried moving the cluster and instance pool to a sub-module, but that produces the same error.

Provider versions:
databricks: "0.4.3"
azurerm: "2.90.0"

Terraform version: 1.0.4

tbugfinder · January 4, 2022, 5:53pm

Do you have the Databricks environment variables also in place?
https://registry.terraform.io/providers/databrickslabs/databricks/latest/docs#environment-variables

plieberg · January 4, 2022, 6:08pm

No, I am passing authentication info through the provider block:

provider "databricks" {
  alias               = "test_01"
  host                = module.test_module_01.workspace_url
  azure_tenant_id     = data.vault_generic_secret.subscription.data["tenant-id"]
  azure_client_id     = data.vault_generic_secret.subscription.data["sp-client-id"]
  azure_client_secret = data.vault_generic_secret.subscription.data["sp-client-secret"]

  # ARM_USE_MSI environment variable is recommended
  azure_use_msi = true
}

Then calling the provider in my module block:

module "test_module_01" {
  source = "../"

  providers = {
    databricks = databricks.test_01
  }
<truncated file for space>

tbugfinder · January 4, 2022, 6:15pm

plieberg:

provider "databricks" {
  alias               = "test_01"
  host                = module.test_module_01.workspace_url
  azure_tenant_id     = data.vault_generic_secret.subscription.data["tenant-id"]
  azure_client_id     = data.vault_generic_secret.subscription.data["sp-client-id"]
  azure_client_secret = data.vault_generic_secret.subscription.data["sp-client-secret"]

  # ARM_USE_MSI environment variable is recommended
  azure_use_msi = true
}

Hm, if azure_use_msi should be used, couldn’t azure_*_id variables be dropped?

plieberg · January 4, 2022, 6:54pm

Yes, that is true. I added them as a desperate attempt to fix my issue.

(Morgan Freeman voice) It did not.

tbugfinder · January 4, 2022, 10:11pm

This is the line of code which raises the error.

How about adding azure_workspace_resource_id to the provider block and verifying the Contributor role within the subscription?
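
For reference, the suggestion could look roughly like the block below. It is a sketch, not a confirmed fix: `workspace_id` is a hypothetical module output that would have to expose the `id` of the `azurerm_databricks_workspace` resource, and the Vault keys are the ones from the provider block earlier in the thread.

```hcl
provider "databricks" {
  alias = "test_01"

  # Hypothetical output: the module would need to export the id of its
  # azurerm_databricks_workspace resource under this name.
  azure_workspace_resource_id = module.test_module_01.workspace_id

  # Service principal credentials, read from Vault as in the original block.
  azure_tenant_id     = data.vault_generic_secret.subscription.data["tenant-id"]
  azure_client_id     = data.vault_generic_secret.subscription.data["sp-client-id"]
  azure_client_secret = data.vault_generic_secret.subscription.data["sp-client-secret"]

  # Assumption: host is left out here so the provider resolves the workspace
  # URL from the resource ID; azure_use_msi is dropped because client-secret
  # auth is being used instead.
}
```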

plieberg · January 5, 2022, 6:11pm

Good idea, but that did not work.

hovhannes.hovakimyan · January 10, 2022, 1:19pm

@plieberg have you solved the issue? I’m asking because I have practically the same issue, and it started happening within the last week.

dvasdekis · January 12, 2022, 1:39am

Same for us. Started happening this week (just returned after break).

tbugfinder · January 12, 2022, 9:33pm

I’m wondering if this is related to the following GitHub issue with token {} support.

github.com/databrickslabs/terraform-provider-databricks

cannot create mws workspaces: cannot create token: cannot configure direct auth: host is empty, but is required by basic_auth. Environment variables used: DATABRICKS_HOST, DATABRICKS_USERNAME, DATABRICKS_PASSWORD

opened 07:32PM - 21 Dec 21 UTC

closed 11:42AM - 24 Dec 21 UTC

nfx

aws

* Using directly configured host+basic_auth authentication
* Configured direct auth: host=https://XXXXX.cloud.databricks.com, token=***REDACTED***, username=user@domain, password=***REDACTED***, azure_use_msi=false
* GET /api/2.0/token/list
* 200 OK {} <- GET /api/2.0/token/list
* Creating client for host based on host=https://accounts.cloud.databricks.com, token=***REDACTED***, username=user@domain, password=***REDACTED***, azure_use_msi=false
* Using basic auth for user 'user@domain'
* error: cannot create mws workspaces: cannot create token: cannot configure direct auth: host is empty, but is required by basic_auth. Environment variables used: DATABRICKS_HOST, DATABRICKS_USERNAME, DATABRICKS_PASSWORD. Please check https://registry.terraform.io/providers/databrickslabs/databricks/latest/docs#authentication for details

plieberg · January 13, 2022, 3:29pm

No, this is still an issue.

ekhaydarov · January 18, 2022, 8:51am

I’ve been scratching my head at this for a few days now too. For me, the issue seems to be triggered when two conditions are met: a Databricks workspace was created with an azurerm version that predates the introduction of public_network_access_enabled, and azurerm is then upgraded. The newer version enforces that attribute to true, which breaks the databricks provider for inexplicable reasons. I am currently working on figuring out whether it’s possible to set public_network_access_enabled = true without breaking the databricks provider. It also forces azurerm_storage_data_lake_gen2_path to be recreated, which is not good either.

It is absolutely a bug in azurerm. I am tired of their half-baked products. I hope this saves others some time.
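
For anyone wanting to try the same experiment, setting the attribute explicitly might look like the sketch below. The resource and resource group names are hypothetical, and whether this avoids the breakage described above is exactly the open question, so treat it as a starting point rather than a fix.

```hcl
resource "azurerm_databricks_workspace" "example" {
  name                = "databricks-test"                   # hypothetical name
  resource_group_name = azurerm_resource_group.example.name # hypothetical resource group
  location            = azurerm_resource_group.example.location
  sku                 = "premium"

  # State the value explicitly instead of relying on the provider default,
  # so the plan shows whether an azurerm upgrade wants to change it.
  public_network_access_enabled = true
}
```

Running `terraform plan` after the azurerm upgrade would then show whether the workspace or any dependent resources are still marked for change.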

plieberg · January 18, 2022, 7:36pm

@ekhaydarov Do you know what version of azurerm that was? I’m trying to see whether back-leveling to that version gets the module I am building to work.

My latest attempt was to create the workspace with my module, then come back and add plain live resource code to create an instance pool. It fails for the same reason, even with depends_on specified for the data blocks and the live resources.

plieberg · January 19, 2022, 4:05pm

I opened a github issue for this, if you would like to go hit the thumbs up or comment:

github.com/databrickslabs/terraform-provider-databricks

[ISSUE] Cannot add instance pool or cluster after workspace was already created

opened 12:00AM - 19 Jan 22 UTC

plieberg

lazy auth

I’m writing an internal module for managing our Azure Databricks resources. The first iteration that simply created a workspace ran fine. However, I am now trying to add clusters and instance pools and running into an issue. To remove my module from the equation, I simply created some live code resources and am running into the same issue. As you can see, I've added the depends_on to the data resources as well as the cluster/instance_pool resources.

### Configuration

```hcl
resource "azurerm_resource_group" "example" {
  name     = "example-resources"
  location = "eastus2"
}

resource "azurerm_databricks_workspace" "example" {
  name                        = "databricks-test"
  resource_group_name         = azurerm_resource_group.example.name
  location                    = azurerm_resource_group.example.location
  sku                         = "premium"
  managed_resource_group_name = azurerm_resource_group.example.location

  custom_parameters {
    #Optional
    no_public_ip                                         = true
    virtual_network_id                                   = data.azurerm_virtual_network.dma_vnet.id
    public_subnet_network_security_group_association_id  = data.azurerm_subnet.dma_subnet_dbpub.id
    private_subnet_network_security_group_association_id = data.azurerm_subnet.dma_subnet_dbpri.id

    ## Required if virtual_network_id is defined
    public_subnet_name  = data.azurerm_subnet.dma_subnet_dbpri.name
    private_subnet_name = data.azurerm_subnet.dma_subnet_dbpub.name
  }
}

data "databricks_node_type" "smallest" {
  depends_on = [
    azurerm_databricks_workspace.example,
  ]
}

resource "databricks_instance_pool" "smallest_nodes" {
  instance_pool_name                    = "Smallest Nodes"
  min_idle_instances                    = 0
  max_capacity                          = 300
  node_type_id                          = data.databricks_node_type.smallest.id
  idle_instance_autotermination_minutes = 10

  disk_spec {
    disk_type {
      azure_disk_volume_type = "PREMIUM_LRS"
    }
    disk_size  = 80
    disk_count = 1
  }

  depends_on = [
    azurerm_databricks_workspace.example,
  ]
}
```

### Expected Behavior

What should have happened?

Instance pool is created

### Actual Behavior

What actually happened?

Errors, see below

### Steps to Reproduce

Please list the steps required to reproduce the issue, for example:

1. `terraform plan` or `terraform apply`

### Terraform and provider versions

```
Terraform v1.0.4
on linux_amd64
+ provider registry.terraform.io/databrickslabs/databricks v0.4.5
+ provider registry.terraform.io/hashicorp/azuread v2.15.0
+ provider registry.terraform.io/hashicorp/azurerm v2.92.0
+ provider registry.terraform.io/hashicorp/external v2.2.0
+ provider registry.terraform.io/hashicorp/http v2.1.0
+ provider registry.terraform.io/hashicorp/null v3.1.0
+ provider registry.terraform.io/hashicorp/random v3.1.0
+ provider registry.terraform.io/hashicorp/time v0.7.2
+ provider registry.terraform.io/hashicorp/vault v3.1.1
```

Please paste the output of `terraform version`. If version of `databricks` provider is not the latest (https://github.com/databrickslabs/terraform-provider-databricks/releases), please make sure to use the latest one.

### Debug Output

Please add turn on logging, e.g. `TF_LOG=DEBUG terraform apply` and run command again, paste it to gist & provide the link to gist. If you're still willing to paste in log output, make sure you provide only relevant log lines with requests. It would make it more readable, if you pipe the log through `| grep databricks | sed -E 's/^.* plugin[^:]+: (.*)$/\1/'`, e.g.:

This is the error during the plan step, if I try to create a cluster resource:

```
Error: workspace is most likely not created yet, because the `host` is empty. Please add `depends_on = [databricks_mws_workspaces.this]` or `depends_on = [azurerm_databricks_workspace.this]` to every data resource. See https://www.terraform.io/docs/language/resources/behavior.html more info. Please check https://registry.terraform.io/providers/databrickslabs/databricks/latest/docs#authentication for details

  with data.databricks_spark_version.latest_lts,
  on databricks_workspace_pll.tf line 32, in data "databricks_spark_version" "latest_lts":
  32: data "databricks_spark_version" "latest_lts" {
```

This is the error from an apply if I try to add an instance_pool resource:

```
Error: cannot create instance pool: authentication is not configured for provider.. Please check https://registry.terraform.io/providers/databrickslabs/databricks/latest/docs#authentication for details

  with databricks_instance_pool.smallest_nodes,
  on databricks_workspace_pll.tf line 32, in resource "databricks_instance_pool" "smallest_nodes":
  32: resource "databricks_instance_pool" "smallest_nodes" {
```

If Terraform produced a panic, please provide a link to a GitHub Gist containing the output of the `crash.log`.

### Important Factoids

Are there anything atypical about your accounts that we should know?

tbugfinder · February 18, 2022, 9:29pm

Today I also ran into this authentication mess, and I think it got fixed by updating azure-cli.

nfx · May 19, 2022, 8:05am

Please raise an issue on the Databricks provider issue tracker if this behavior persists.

Attributes used: azure_client_id, azure_client_secret, azure_tenant_id. Please check Terraform Registry for details

I think that you should have added the service principal to the Databricks workspace. Technically, Azure Databricks requires azure_workspace_resource_id in headers only the first time an SPN makes a call to Azure Databricks APIs, but I’ve tried to make it explicit in the Terraform provider.

kantegajorgen · November 20, 2022, 9:00am

I have struggled with a similar issue for a couple of days now. I have no infrastructure in place before I run terraform apply. It failed on one of my data resources, which tries to fetch latest_lts. But without a workspace, there is no endpoint to fetch this from.

After reading the troubleshooting guide (link):

Most data resources make an API call to a workspace. If a workspace doesn’t exist yet, authentication is not configured for provider error is raised. To work around this issue and guarantee a proper lazy authentication with data resources, you should add depends_on = [azurerm_databricks_workspace.this] or depends_on = [databricks_mws_workspaces.this] to the body.

I added a depends_on to the data resources. After this it went through.

data "databricks_node_type" "smallest" {
  local_disk = true
  depends_on = [azurerm_databricks_workspace.my_workspace]
}

data "databricks_spark_version" "latest_lts" {
  long_term_support = true
  depends_on = [azurerm_databricks_workspace.my_workspace]
}
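
As a rough illustration of how these two data sources would then be consumed, a cluster could reference them as sketched below. The cluster name, worker count, and auto-termination value are hypothetical, and the sketch assumes the databricks provider is already configured against the workspace created by `azurerm_databricks_workspace.my_workspace`.

```hcl
resource "databricks_cluster" "example" {
  cluster_name = "example-cluster" # hypothetical name

  # Both values come from the data sources above, which wait for the
  # workspace because of their depends_on.
  spark_version = data.databricks_spark_version.latest_lts.id
  node_type_id  = data.databricks_node_type.smallest.id

  num_workers             = 1  # hypothetical sizing
  autotermination_minutes = 20 # hypothetical idle timeout
}
```

Because the cluster references the data sources, it inherits their ordering after the workspace; the explicit depends_on on the data sources is what delays the workspace API calls until the workspace exists.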
