dbt-databricks icon indicating copy to clipboard operation
dbt-databricks copied to clipboard

show_views macro fails on subsequent executions of the dbt run command

Open zayscue opened this issue 1 year ago • 3 comments

Describe the bug

This is a strange one. I'm currently trying to set up a DBT workflow where each new feature branch of a DBT project would auto-provision a new catalog in Databricks Unity Catalog to isolate development changes until they are ready to merge into production. To facilitate the workflow, I have set up the DBT project's profiles.yml file like so:

sample_project:
  target: dev
  outputs:
    dev:
      catalog: "sample_project_{{ env_var('DBT_CATALOG_SUFFIX') | replace('-', '_') | replace('/', '_') }}" 
      host: "{{ env_var('DBT_HOST') }}"
      http_path: "{{ env_var('DBT_HTTP_PATH') }}"
      schema: sample_project
      token: "{{ env_var('DBT_ENV_SECRET_TOKEN') }}"
      type: databricks
    prod:
      catalog: production
      host: "{{ env_var('DBT_HOST') }}"
      http_path: "{{ env_var('DBT_HTTP_PATH') }}"
      schema: sample_project
      token: "{{ env_var('DBT_ENV_SECRET_TOKEN') }}"
      type: databricks

Where the catalog for the dev target is dynamically generated by appending an environment variable value to it, everything seems to work fine on the first execution of the dbt run command when I invoke it with the following flags:

dbt run  --profiles-dir . --target dev

However, on subsequent runs of the same command without any changes, I get the following error:

17:14:34  Encountered an error:
Runtime Error
  Runtime Error
    Runtime Error
      Fail to execute the command as the target schema `sample_project_feature_DA_2000.sample_project` is not in the current catalog. Please set the current catalog with 'USE CATALOG sample_project_feature_DA_2000' first.

After looking through the logs, I see this output:

[0m13:14:34.410049 [debug] [ThreadPool]: Databricks adapter: Error while running:

/* {"app": "dbt", "dbt_version": "1.6.0", "dbt_databricks_version": "1.6.1", "databricks_sql_connector_version": "2.7.0", "profile_name": "sample_project", "target_name": "dev", "connection_name": "list_sample_project_feature_DA_2000_sample_project"} */
show views in `sample_project_feature_DA_2000`.`sample_project`

It seems that the Databricks adapter is trying to run the SHOW VIEWS query in Databricks, and the docs seem to indicate that it doesn't support passing the catalog name in with the schema name.

I was able to solve the problem locally by modifying the source code for the macro from:

{% macro databricks__show_views(relation) %}
  {% call statement('show_views', fetch_result=True) -%}
    show views in {{ relation }}
  {% endcall %}

  {% do return(load_result('show_views').table) %}
{% endmacro %}

to this:

{% macro databricks__show_views(relation) %}
  {% set relation_parts = "{}".format(relation).split('.') %}
  {% call statement('use_catalog', fetch_result=True) -%}
    use catalog {{ relation_parts[0] }}
  {% endcall %}
  {% call statement('show_views', fetch_result=True) -%}
    show views in {{ relation_parts[1] }}
  {% endcall %}

  {% do return(load_result('show_views').table) %}
{% endmacro %}

I'm sure there is a more elegant solution to this problem.

Steps To Reproduce

  1. Setup a DBT project with Databricks Unity Catalog
  2. Configure profiles.yml to use a dynamic catalog in the target
  3. Ensure the development catalog is created and the user you run the dbt process with has USE CATALOG, CREATE SCHEMA, and USE SCHEMA permissions in the catalog
  4. execute dbt run and receive a successful result
  5. execute dbt run again without any changes and receive a failure

Expected behavior

After the initial successful invocation of the dbt run command, any subsequent dbt run command invocations without changes should result in another successful result rather than a failure.

System information

The output of dbt --version:

Core:
  - installed: 1.6.0
  - latest:    1.6.0 - Up to date!

Plugins:
  - databricks: 1.6.1 - Up to date!
  - spark:      1.6.0 - Up to date!

The operating system you're using: Apple M2 Pro macOS 13.4.1

The output of python --version: Python 3.10.10

Additional context

My sample project has one model that materializes a table and one model that creates a view. I added so much context about the dynamic catalog in the profiles.yml because when I ran the same steps with a static catalog in my profiles.yml file, this bug did not occur.

zayscue avatar Aug 04 '23 18:08 zayscue

While this does look like a bug, I'm wondering why you are dynamically making catalogs rather than schemas?

benc-db avatar Sep 13 '23 16:09 benc-db

Would you mind submitting a PR with your solution? Given the limitations you described, it seems reasonable to me.

benc-db avatar Sep 13 '23 16:09 benc-db

This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please remove the stale label or comment on the issue.

github-actions[bot] avatar Apr 01 '24 01:04 github-actions[bot]