dbt-databricks
1.7.0+ always drop/creates streaming tables, rather than refreshing them if possible
Describe the bug
As of 1.7, streaming tables are always dropped and recreated when a model materialized as 'streaming_table' is run, rather than simply refreshed.
Steps To Reproduce
- Create a streaming table model, e.g.
{{ config(
materialized = 'streaming_table',
)}}
SELECT *, CURRENT_TIMESTAMP() as processed_time
FROM STREAM read_files( '{{ var("my_s3_bucket") }}', format => 'json' )
- Run the model multiple times. On every run after the first, dbt will output:
Dropping relation <relation name> because it is of type table
and issue a "drop table" statement, followed by a "create streaming table" statement, rather than just a "create or refresh streaming table" statement.
Expected behavior
Running a streaming table model against a previously created streaming table should cause a refresh, not a recreate (except for changes that force a full-refresh, of course).
System information
The output of dbt --version:
Various - Initially discovered with dbt-databricks and dbt-core 1.7.3,
but verified that it first occurs in dbt-databricks 1.7.0
The operating system you're using:
OSX Ventura 13.3.1
The output of python --version:
Python 3.10.11
Additional context
It appears to me the issue is in the _parse_type method introduced in PR 499, the body of which is:
def _parse_type(self, information: str) -> str:
    type_entry = [
        entry.strip() for entry in information.split("\n") if entry.split(":")[0] == "Type"
    ]
    return type_entry[0] if type_entry else ""
The return value from this method is compared to "STREAMING_TABLE"; however, as written, it returns "Type: STREAMING_TABLE", so the comparison never matches. The return value needs to be split to solve it, though that is not particularly pretty:
    type_entry = [
        entry.split(":")[1].strip() for entry in information.split("\n") if entry.split(":")[0] == "Type"
    ]
_parse_type only seems to be called for this particular check, so there shouldn't be side effects of this change, but I have not tested for that at all.
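To make the mismatch concrete, here is a minimal standalone sketch comparing the two versions of the parsing logic. The function names `parse_type_current` and `parse_type_fixed` are hypothetical free-standing copies (the real code is a method on the adapter), and the `sample` string is an assumed, simplified shape of the table metadata text, not actual output from Databricks. The fixed version uses `split(":", 1)`, a small variation on the snippet above, so that any later colons in the value are kept.

```python
def parse_type_current(information: str) -> str:
    """Logic as quoted in the report: returns the whole matching line."""
    type_entry = [
        entry.strip() for entry in information.split("\n") if entry.split(":")[0] == "Type"
    ]
    return type_entry[0] if type_entry else ""


def parse_type_fixed(information: str) -> str:
    """Proposed change: keep only the text after the first colon."""
    type_entry = [
        entry.split(":", 1)[1].strip()
        for entry in information.split("\n")
        if entry.split(":")[0] == "Type"
    ]
    return type_entry[0] if type_entry else ""


# Hypothetical, simplified metadata text; the real output has more fields.
sample = "Catalog: main\nType: STREAMING_TABLE\nProvider: delta"

print(parse_type_current(sample) == "STREAMING_TABLE")  # False
print(parse_type_fixed(sample) == "STREAMING_TABLE")    # True
```

If the comparison against "STREAMING_TABLE" fails as in the first case, the relation would presumably be treated as a plain table, which would match the "Dropping relation ... because it is of type table" message seen in the logs.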
Thanks for the report and the debugging. Will incorporate the fix into the next release; due to holidays, however, this release will probably not be available until after New Year.
Sorry this took so long; please give 1.8.0b1 a shot: https://github.com/databricks/dbt-databricks/discussions/595