dbt-databricks
dbt-databricks copied to clipboard
show_views macro fails on subsequent executions of the dbt run command
Describe the bug
This is a strange one. I'm currently trying to set up a DBT workflow where each new feature branch of a DBT project would auto-provision a new catalog in Databricks Unity Catalog to isolate development changes until they are ready to merge into production. To facilitate the workflow, I have set up the DBT project's profiles.yml file like so:
sample_project:
target: dev
outputs:
dev:
catalog: "sample_project_{{ env_var('DBT_CATALOG_SUFFIX') | replace('-', '_') | replace('/', '_') }}"
host: "{{ env_var('DBT_HOST') }}"
http_path: "{{ env_var('DBT_HTTP_PATH') }}"
schema: sample_project
token: "{{ env_var('DBT_ENV_SECRET_TOKEN') }}"
type: databricks
prod:
catalog: production
host: "{{ env_var('DBT_HOST') }}"
http_path: "{{ env_var('DBT_HTTP_PATH') }}"
schema: sample_project
token: "{{ env_var('DBT_ENV_SECRET_TOKEN') }}"
type: databricks
Where the catalog for the dev target is dynamically generated by appending an environment variable value to it, everything seems to work fine on the first execution of the dbt run command when I invoke it with the following flags:
dbt run --profiles-dir . --target dev
However, on subsequent runs of the same command without any changes, I get the following error:
17:14:34 Encountered an error:
Runtime Error
Runtime Error
Runtime Error
Fail to execute the command as the target schema `sample_project_feature_DA_2000.sample_project` is not in the current catalog. Please set the current catalog with 'USE CATALOG sample_project_feature_DA_2000' first.
After looking through the logs, I see this output:
[0m13:14:34.410049 [debug] [ThreadPool]: Databricks adapter: Error while running:
/* {"app": "dbt", "dbt_version": "1.6.0", "dbt_databricks_version": "1.6.1", "databricks_sql_connector_version": "2.7.0", "profile_name": "sample_project", "target_name": "dev", "connection_name": "list_sample_project_feature_DA_2000_sample_project"} */
show views in `sample_project_feature_DA_2000`.`sample_project`
It seems that the Databricks adapter is trying to run the SHOW VIEWS query in Databricks, and the docs seem to indicate that it doesn't support passing the catalog name in with the schema name.
I was able to solve the problem locally by modifying the source code for the macro from:
{% macro databricks__show_views(relation) %}
{% call statement('show_views', fetch_result=True) -%}
show views in {{ relation }}
{% endcall %}
{% do return(load_result('show_views').table) %}
{% endmacro %}
to this:
{% macro databricks__show_views(relation) %}
{% set relation_parts = "{}".format(relation).split('.') %}
{% call statement('use_catalog', fetch_result=True) -%}
use catalog {{ relation_parts[0] }}
{% endcall %}
{% call statement('show_views', fetch_result=True) -%}
show views in {{ relation_parts[1] }}
{% endcall %}
{% do return(load_result('show_views').table) %}
{% endmacro %}
I'm sure there is a more elegant solution to this problem.
Steps To Reproduce
- Setup a DBT project with Databricks Unity Catalog
- Configure profiles.yml to use a dynamic catalog in the target
- Ensure the development catalog is created and the user you run the dbt process with has USE CATALOG, CREATE SCHEMA, and USE SCHEMA permissions in the catalog
- execute dbt run and receive a successful result
- execute dbt run again without any changes and receive a failure
Expected behavior
After the initial successful invocation of the dbt run command, any subsequent dbt run command invocations without changes should result in another successful result rather than a failure.
System information
The output of dbt --version
:
Core:
- installed: 1.6.0
- latest: 1.6.0 - Up to date!
Plugins:
- databricks: 1.6.1 - Up to date!
- spark: 1.6.0 - Up to date!
The operating system you're using: Apple M2 Pro macOS 13.4.1
The output of python --version
:
Python 3.10.10
Additional context
My sample project has one model that materializes a table and one model that creates a view. I added so much context about the dynamic catalog in the profiles.yml because when I ran the same steps with a static catalog in my profiles.yml file, this bug did not occur.
While this does look like a bug, I'm wondering why you are dynamically making catalogs rather than schemas?
Would you mind submitting a PR with your solution? Given the limitations you described, it seems reasonable to me.
This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please remove the stale label or comment on the issue.