dbt-databricks tries to establish a connection to Databricks when running `dbt parse`
Describe the bug
According to the dbt documentation, running dbt parse should work in an isolated environment with no Databricks workspace available. From the dbt parse documentation:
Starting in v1.5, dbt parse will write or return a manifest, enabling you to introspect dbt's understanding of all the resources in your project. Since dbt parse doesn't connect to your warehouse, this manifest will not contain any compiled code.
This is especially useful when building a CI/CD pipeline in which you want to generate a manifest.json file.
This used to work as expected with dbt-databricks, but it seems to have been broken in version 1.9.0.
Steps To Reproduce
Define a dummy http_path (or other) value in the dbt profile and run dbt parse. With dbt-databricks 1.8.7 this works as expected, whereas any version since 1.9.0 produces an unhandled exception.
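For anyone trying to reproduce this, here is a minimal sketch of a helper that writes a profile containing only placeholder values. The profile name `my_project`, the target name `ci`, the host, and the token are all made up for illustration, and token-based auth is an assumption; the profile name would need to match the `profile:` entry in your dbt_project.yml.

```python
import tempfile
from pathlib import Path

# Placeholder values only: none of these point at a real workspace.
# "my_project" is hypothetical and must match `profile:` in dbt_project.yml.
DUMMY_PROFILE = """\
my_project:
  target: ci
  outputs:
    ci:
      type: databricks
      host: dummy.cloud.databricks.com
      http_path: /sql/1.0/warehouses/dummy
      token: dummy-token
      schema: dummy_schema
"""


def write_dummy_profile(directory: Path) -> Path:
    """Write a profiles.yml with placeholder Databricks settings."""
    directory.mkdir(parents=True, exist_ok=True)
    path = directory / "profiles.yml"
    path.write_text(DUMMY_PROFILE, encoding="utf-8")
    return path


if __name__ == "__main__":
    profiles_dir = Path(tempfile.mkdtemp())
    print(write_dummy_profile(profiles_dir))
    # Then, from the dbt project root:
    #   dbt parse --profiles-dir <the directory printed above>
```

Pointing `dbt parse --profiles-dir` at the generated directory should be enough to compare the behavior of the two adapter versions.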
Expected behavior
Generate a manifest.json without trying to connect to Databricks.
Screenshots and log output
Here is part of the traceback you get when running dbt parse:
13:43:50 Traceback (most recent call last):
  File "/workspaces/dbt/.venv/lib/python3.12/site-packages/dbt/cli/requires.py", line 138, in wrapper
    result, success = func(*args, **kwargs)
                      ^^^^^^^^^^^^^^^^^^^^^
  File "/workspaces/dbt/.venv/lib/python3.12/site-packages/dbt/cli/requires.py", line 101, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/workspaces/dbt/.venv/lib/python3.12/site-packages/dbt/cli/requires.py", line 218, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/workspaces/dbt/.venv/lib/python3.12/site-packages/dbt/cli/requires.py", line 247, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/workspaces/dbt/.venv/lib/python3.12/site-packages/dbt/cli/requires.py", line 294, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/workspaces/dbt/.venv/lib/python3.12/site-packages/dbt/cli/requires.py", line 320, in wrapper
    ctx.obj["manifest"] = parse_manifest(
                          ^^^^^^^^^^^^^^^
  File "/workspaces/dbt/.venv/lib/python3.12/site-packages/dbt/parser/manifest.py", line 1895, in parse_manifest
    register_adapter(runtime_config, get_mp_context())
  File "/workspaces/dbt/.venv/lib/python3.12/site-packages/dbt/adapters/factory.py", line 203, in register_adapter
    FACTORY.register_adapter(config, mp_context, adapter_registered_log_level)
  File "/workspaces/dbt/.venv/lib/python3.12/site-packages/dbt/adapters/factory.py", line 118, in register_adapter
    adapter: Adapter = adapter_type(config, mp_context) # type: ignore
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspaces/dbt/.venv/lib/python3.12/site-packages/dbt/adapters/databricks/impl.py", line 176, in __init__
    super().__init__(config, mp_context)
  File "/workspaces/dbt/.venv/lib/python3.12/site-packages/dbt/adapters/base/impl.py", line 271, in __init__
    self.connections = self.ConnectionManager(config, mp_context)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspaces/dbt/.venv/lib/python3.12/site-packages/dbt/adapters/databricks/connections.py", line 712, in __init__
    super().__init__(profile, mp_context)
  File "/workspaces/dbt/.venv/lib/python3.12/site-packages/dbt/adapters/databricks/connections.py", line 385, in __init__
    self.api_client = DatabricksApiClient.create(creds, 15 * 60)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspaces/dbt/.venv/lib/python3.12/site-packages/dbt/adapters/databricks/api_client.py", line 560, in create
    credentials_provider = credentials.authenticate(None)
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspaces/dbt/.venv/lib/python3.12/site-packages/dbt/adapters/databricks/credentials.py", line 262, in authenticate
    self._credentials_provider = provider.as_dict()
System information
The output of dbt --version:
Core:
- installed: 1.9.2
- latest: 1.9.2 - Up to date!
Plugins:
- spark: 1.9.1 - Up to date!
- databricks: 1.9.4 - Up to date!
Debian Bookworm, Python 3.12.8
Additional context
While trying to identify the origin of the issue, I installed dbt-core==1.8.8 together with dbt-databricks==1.9.4, which gives the error above, whereas dbt-core==1.8.8 together with dbt-databricks==1.8.7 does not. This seems to confirm that the root cause is a change to dbt-databricks, not to dbt-core.
I have not been able to pinpoint a specific change, since there was a significant refactor between these two versions. I suspect the issue might be here: https://github.com/databricks/dbt-databricks/blob/40c23374210f814334bafe59ec03e5bf18c5d86b/dbt/adapters/databricks/connections.py#L186 This seems to have been introduced by #849.
It looks like DatabricksConnectionManager calls DatabricksApiClient.create in its initializer, which in turn establishes a connection to Databricks. Maybe that is a bit early in the process and should only happen later, when open is called?
According to the documentation of ConnectionManager:
open() is a classmethod that gets a connection object (which could be in any state, but will have a Credentials object with the attributes you defined above) and moves it to the 'open' state.
I would therefore not expect the connection to be opened before this function is called.
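To illustrate the kind of deferral I have in mind, here is a minimal, self-contained sketch. It is not the dbt-databricks code and not a proposed patch: `FakeApiClient` and `LazyConnectionManager` are hypothetical names, and `open` is simplified to an instance method with no connection argument. The only point is that the initializer stores a factory and the client is built on first use.

```python
from typing import Callable, Optional


class FakeApiClient:
    """Hypothetical stand-in for a client whose construction authenticates."""

    created = 0  # counts how many clients have been built

    def __init__(self) -> None:
        FakeApiClient.created += 1


class LazyConnectionManager:
    """Hypothetical manager that defers client creation until first use."""

    def __init__(self, client_factory: Callable[[], FakeApiClient]) -> None:
        # Store only the factory; nothing is contacted here, so a
        # parse-only command could build the manager offline.
        self._client_factory = client_factory
        self._api_client: Optional[FakeApiClient] = None

    @property
    def api_client(self) -> FakeApiClient:
        # Build the client on first access and reuse it afterwards.
        if self._api_client is None:
            self._api_client = self._client_factory()
        return self._api_client

    def open(self) -> FakeApiClient:
        # In this sketch, the first point where authentication would happen.
        return self.api_client


manager = LazyConnectionManager(FakeApiClient)
constructed_before_open = FakeApiClient.created  # no client built yet
manager.open()
manager.open()
constructed_after_open = FakeApiClient.created  # built once, then cached
```

With this shape, constructing the manager has no side effects, and the cost of authenticating is only paid by commands that actually open a connection.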
Thanks for reporting. Will discuss with dbt-core.
Thank you 😊
@benc-db I see you have a fix for this. When do you plan to release a new version with this fix?
@benc-db I'm having the exact same problem but with dbt-databricks 1.10.12:
`Core:
- installed: 1.10.11
- latest: 1.10.11 - Up to date!
Plugins:
- databricks: 1.10.12 - Up to date!`
It keeps trying to open a connection to Databricks when running dbt parse.
@mmonteiro18 can you share a stack trace so that I can find where it's happening? I haven't been able to repro.