dbt-databricks

'append_new_columns' flag causes failures for 'append' incremental models

Open TannerHopkins opened this issue 3 years ago • 2 comments

Describe the bug

When using the append_new_columns flag for incremental models with an append incremental strategy, the first dbt run adding new columns will fail. However, the next run and all runs after will succeed. The reason for this appears to be that new columns are successfully added, but the first insert statement after the table has been altered still uses the old columns (i.e., 3 instead of 4 columns).

Note that this does not happen for the merge incremental strategy.
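
To illustrate the suspected mechanism, here is a minimal sketch that reproduces the same class of failure outside of dbt. It assumes the cause is a column list captured before the `ALTER TABLE` and reused for the insert. It uses Python's built-in `sqlite3` as a stand-in for Databricks; the table and column names (`target`, `new_batch`, `region`) are hypothetical, and this is not the adapter's actual code.

```python
import sqlite3


def run_demo():
    """Simulate a stale column list being reused after a schema change."""
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()

    # Existing 3-column target table, and a new batch that carries a 4th column.
    cur.execute("CREATE TABLE target (id INTEGER, name TEXT, amount REAL)")
    cur.execute(
        "CREATE TABLE new_batch (id INTEGER, name TEXT, amount REAL, region TEXT)"
    )
    cur.execute("INSERT INTO new_batch VALUES (1, 'a', 9.5, 'emea')")

    # Column list captured BEFORE the schema change (the assumed bug).
    stale_cols = [row[1] for row in cur.execute("PRAGMA table_info(target)")]

    # Schema change: the new column is appended to the target table.
    cur.execute("ALTER TABLE target ADD COLUMN region TEXT")

    # Insert built from the stale list: 3 values into a 4-column table.
    stale = ", ".join(stale_cols)
    first_error = None
    try:
        cur.execute(f"INSERT INTO target SELECT {stale} FROM new_batch")
    except sqlite3.OperationalError as exc:
        first_error = str(exc)

    # Re-reading the columns AFTER the alter makes the insert succeed,
    # which mirrors the second dbt run passing.
    fresh_cols = [row[1] for row in cur.execute("PRAGMA table_info(target)")]
    fresh = ", ".join(fresh_cols)
    cur.execute(f"INSERT INTO target ({fresh}) SELECT {fresh} FROM new_batch")

    rows = cur.execute("SELECT * FROM target").fetchall()
    conn.close()
    return first_error, rows


if __name__ == "__main__":
    error, rows = run_demo()
    print("first insert error:", error)
    print("rows after second insert:", rows)
```

If the adapter behaves like this sketch, refreshing the destination column list after the schema change and before building the insert would let the first run succeed.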

Steps To Reproduce

1.) Configure an incremental model with the append incremental_strategy and append_new_columns for on_schema_change, like so:

{{
    config(
        materialized = 'incremental'
        , incremental_strategy = 'append'
        , on_schema_change = 'append_new_columns'
    )
}}

2.) Execute an initial (full refresh) build of the model

3.) Add a new column

4.) Execute an incremental build of the model (it should fail)

5.) Execute an incremental build of the model again (it should succeed)

Expected behavior

The first incremental run after a schema change (new column) should succeed, rather than failing once and succeeding only on later runs.

Screenshots and log output

Here's an example of what the error looks like:

Cannot write to '[REDACTED].incremental_test', not enough data columns; target table has 4 column(s) but the inserted data has 3 column(s)

Example of the Databricks query log showing the insert failing after the alter table, then the next insert statement succeeding: (image)

System information

The output of dbt --version:

dbtenv info:  Using dbt-databricks==1.1.0 (set by dbt_project.yml).
Core:
  - installed: 1.1.1
  - latest:    1.2.0 - Update available!

  Your version of dbt-core is out of date!
  You can find instructions for upgrading here:
  https://docs.getdbt.com/docs/installation

Plugins:
  - databricks: 1.1.0 - Update available!
  - spark:      1.1.0 - Update available!

  At least one plugin is out of date or incompatible with dbt-core.
  You can find instructions for upgrading here:
  https://docs.getdbt.com/docs/installation

(I realize this isn't the most up-to-date version, but I didn't see any closed issues related to this)

The operating system you're using: macOS Monterey v12.0.1

The output of python --version: Python 3.8.13

TannerHopkins, Aug 24 '22 16:08

Looks like this bug was recently raised over in the dbt-spark repo. Hopefully we can apply the same fix here.

TannerHopkins, Aug 25 '22 14:08

This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please remove the stale label or comment on the issue.

github-actions[bot], Feb 22 '23 02:02