Provider instances are certainly part of the dependency graph already, but I see where the confusion lies now.
There is no hidden discovery going on here; Terraform is using the dependencies declared in the configuration itself. The databricks resources all depend on their provider, which in turn depends on
azurerm_databricks_workspace.this.workspace_url. This dependency is not missing from the graph; what is changing in the configuration is the behavior of the data source when you add
Data sources are intended to be read as soon as possible during the planning process, so that their attributes can be used for planning purposes. However, the
databricks provider in your example requires the value from a managed resource which can’t be known until after the apply is complete. This means that by default
data.databricks_current_user.me will attempt to read the data during planning, but using an incomplete provider configuration (this is allowed for compatibility with providers that can plan correctly with an incomplete configuration).
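To make the dependency chain concrete, here is a minimal sketch of the kind of configuration being described. Only azurerm_databricks_workspace.this, its workspace_url attribute, and data.databricks_current_user.me come from your example; the names, location, and SKU are placeholders I have assumed for illustration.

```hcl
provider "azurerm" {
  features {}
}

# Managed resource: workspace_url is unknown until this has been applied.
# The name, resource group, location, and sku are placeholder values.
resource "azurerm_databricks_workspace" "this" {
  name                = "example-workspace"
  resource_group_name = "example-rg"
  location            = "westeurope"
  sku                 = "premium"
}

# The provider configuration depends on the managed resource above.
provider "databricks" {
  host = azurerm_databricks_workspace.this.workspace_url
}

# Without depends_on, Terraform tries to read this during planning,
# even though the provider's host may still be unknown on the first run.
data "databricks_current_user" "me" {}
```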
If the resulting configuration still works as desired, then adding
depends_on is the correct workaround for this situation. From the Data Resource Dependencies documentation:
"Setting the depends_on meta-argument within data blocks defers reading of the data source until after all changes to the dependencies have been applied."
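Applied to the data source in question, the workaround would look roughly like the following. This is a sketch that assumes the workspace resource is addressed as azurerm_databricks_workspace.this, as in your configuration.

```hcl
data "databricks_current_user" "me" {
  # Defer the read until all changes to the workspace have been applied,
  # so that the provider has a complete configuration when it is used.
  depends_on = [azurerm_databricks_workspace.this]
}
```

Keep in mind that whenever the workspace has pending changes, the data source's attributes will be unknown during planning and only resolved during apply.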
In most cases, this style of multi-level infrastructure is better suited to being applied from multiple configurations, so that you can ensure the required azurerm infrastructure is in place before building out the databricks infrastructure on top of it. If combining the configurations works for your purpose, you can continue using it in this way, but understanding where the individual layers are conceptually separated will help you diagnose similar issues that may arise.
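If you do split the layers, one possible shape for the second (databricks) configuration is to look up the existing workspace with a data source rather than referencing the managed resource. This is only a sketch: the workspace name and resource group are placeholders, and it assumes the azurerm configuration has already been applied separately.

```hcl
provider "azurerm" {
  features {}
}

# Look up the workspace created by the separate azurerm configuration.
# The name and resource group here are placeholder values.
data "azurerm_databricks_workspace" "this" {
  name                = "example-workspace"
  resource_group_name = "example-rg"
}

# The workspace already exists, so its URL is known during planning.
provider "databricks" {
  host = data.azurerm_databricks_workspace.this.workspace_url
}

# Can now be read at plan time without depends_on.
data "databricks_current_user" "me" {}
```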