dbt-databricks

Sources ignore default catalog in 1.7.4 (previously was working)

Open kmarq opened this issue 1 year ago • 4 comments

Describe the bug

dbt Cloud. With dbt-databricks adapter version 1.5, we did not provide a catalog in the connection settings; the catalog defaulted to the one set on the SQL warehouse. While testing the upgrade to 1.7.4, we found that the warehouse default is ignored for sources.

Models are building in the correct default catalog, but sources always look in hive_metastore.

Setting the catalog in the connection corrects this behavior, but dbt should respect the default catalog set in Databricks if one is set.
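To make the expected precedence concrete, here is a minimal sketch in Python. The function name and parameters are hypothetical and are not part of dbt or the dbt-databricks adapter; it only illustrates the order I would expect: an explicit database on the source, then the catalog from the connection, then the warehouse default.

```python
from typing import Optional


def resolve_source_catalog(
    source_database: Optional[str],
    connection_catalog: Optional[str],
    warehouse_default: str,
) -> str:
    """Hypothetical helper: pick the catalog a source should resolve to."""
    # An explicit value on the source wins, then the connection setting.
    for candidate in (source_database, connection_catalog):
        if candidate:
            return candidate
    # Nothing configured in dbt: fall back to the warehouse default
    # instead of a hard-coded hive_metastore.
    return warehouse_default
```

With the setup from this report (no database on the source, null catalog in the connection, warehouse default of dev_products), the sketch would return dev_products.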

Steps To Reproduce

1. Define a source with schema but no database/catalog
2. Downstream model using that source
3. Leave catalog null in connection
4. Set warehouse default catalog to something other than hive_metastore where the data is

Source will attempt to access hive_metastore

Expected behavior

A clear and concise description of what you expected to happen.

Screenshots and log output

Dev environment: warehouse default catalog = dev_products, null catalog in DBT Databricks connection
Logs:
select current_catalog()
use catalog dev_products
show table extended in dev_products.default_core like '*'
use catalog hive_metastore
describe extended hive_metastore.raw.mrcgroupdim

After updating the connection in dbt to set catalog to dev_products:
select current_catalog()
show table extended in dev_products.default_core like '*'
describe extended dev_products.raw.mrcgroupdim
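For anyone who wants to check their own debug logs for the same symptom, the comparison above can be sketched in Python. The helper name is hypothetical; the statements are the ones from the two log excerpts in this report.

```python
import re

# Statements taken from the log excerpts in this report.
before_fix = [
    "select current_catalog()",
    "use catalog dev_products",
    "show table extended in dev_products.default_core like '*'",
    "use catalog hive_metastore",
    "describe extended hive_metastore.raw.mrcgroupdim",
]
after_fix = [
    "select current_catalog()",
    "show table extended in dev_products.default_core like '*'",
    "describe extended dev_products.raw.mrcgroupdim",
]

# Matches `describe extended <catalog>.<schema>.<table>` and captures the catalog.
DESCRIBE_PATTERN = re.compile(r"^describe extended (\w+)\.\w+\.\w+$", re.IGNORECASE)


def described_catalogs(statements):
    """Hypothetical helper: list the catalog named in each describe statement."""
    catalogs = []
    for statement in statements:
        match = DESCRIBE_PATTERN.match(statement.strip())
        if match:
            catalogs.append(match.group(1))
    return catalogs
```

Running `described_catalogs` on the first list shows the source being looked up in hive_metastore, and on the second list in dev_products.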

System information

14:27:07 Running with dbt=1.7.6
14:27:09 Registered adapter: databricks=1.7.4

DBT Cloud

kmarq avatar Jan 31 '24 14:01 kmarq

Thanks for the report, will investigate.

benc-db avatar Jan 31 '24 21:01 benc-db

Can you validate whether you have this same problem with 1.7.3? I think I know the cause, but getting that additional data point would help significantly. I'm pretty sure this is related to changes we had to make in 1.7.0 due to catalog and schema becoming non-optional on Credentials, so if that is the issue (in which case it would also repro on 1.7.3), the fix might be a little involved.
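As a purely illustrative sketch of how a change like that could produce this symptom (this is not the adapter's actual code; the class names, the placeholder value, and the helper are all assumptions): once a field can no longer be None, "not configured" and "configured as hive_metastore" become indistinguishable, so any fallback to the warehouse default stops firing.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed placeholder used when the user configures nothing.
FALLBACK_CATALOG = "hive_metastore"


@dataclass
class OptionalCatalogCredentials:
    # Hypothetical older shape: None means "no catalog configured".
    catalog: Optional[str] = None


@dataclass
class RequiredCatalogCredentials:
    # Hypothetical newer shape: the field must always hold a string.
    catalog: str = FALLBACK_CATALOG


def catalog_for_source(credentials_catalog: Optional[str], warehouse_default: str) -> str:
    """Hypothetical helper: use the warehouse default only when nothing is set."""
    if credentials_catalog is not None:
        return credentials_catalog
    return warehouse_default


old_behavior = catalog_for_source(OptionalCatalogCredentials().catalog, "dev_products")
new_behavior = catalog_for_source(RequiredCatalogCredentials().catalog, "dev_products")
```

Here `old_behavior` is dev_products while `new_behavior` is hive_metastore, which matches what the logs in this issue show for sources.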

benc-db avatar Feb 01 '24 16:02 benc-db

@benc-db Unfortunately, it looks like dbt Cloud only lets me select the major version and won't let me specify a minor version. I'm not seeing anything in the docs that I can use to override that. We have a workaround in place now by specifying the catalog explicitly, but you may want to make sure that this scenario is noted in the upgrade notes. Also, I see that 1.7.4 of the dbt-databricks releases is marked pre-release now, but it looks like dbt Cloud is still pulling it in. Not sure if there is something required between Databricks and dbt to coordinate rollback items like this. From triggering a run just now to confirm:
16:49:13 Registered adapter: databricks=1.7.4

kmarq avatar Feb 01 '24 16:02 kmarq

Yeah, unfortunately there is required coordination. 1.7.4 has been pulled because it does not play nicely with cold-starting non-serverless SQL Warehouses. Working on fixing that at high priority so that next week dbt Cloud can pick up a good version.

benc-db avatar Feb 01 '24 17:02 benc-db