dbt-spark icon indicating copy to clipboard operation
dbt-spark copied to clipboard

Allow sync all columns for Delta incremental models

Open Jeremynadal33 opened this issue 1 year ago • 4 comments
trafficstars

resolves dbt-labs/dbt-adapters#509 docs dbt-labs/docs.getdbt.com/#

Problem

When having an incremental model with Delta file format, we could not use the sync_all_columns as on_schema_change because in the beginning, Delta did not support dropping columns. It now can (see issue 594 for details) but only when having certains table properties.

Solution

As suggested in issue 594, I added a check on current table properties, comparing it to expected table properties for allowing dropping columns. It then raises an error if trying to remove columns with right table properties.

If table properties are correct, it then first add new columns and then remove columns.

DISCLAIMERS :

  • There is a repeating step, I don't know if I should refacto this into another macro ?
  • I did not add unit tests for this new behavior. I saw PR dbt-labs/dbt-spark#593 but I did not have time to understand the process yet. If possible, I would love an explanation on this, else I will try to understand on my own when I can!

Any feedback on this one will be much appreciated since this is my first open source PR on such a big project, thanks in advance !

Checklist

  • [x] I have read the contributing guide and understand what's expected of me
  • [x] I have run this code in development and it appears to resolve the stated issue
  • [ ] This PR includes tests, or tests are not required/relevant for this PR
  • [ ] This PR has no interface changes (e.g. macros, cli, logs, json artifacts, config files, adapter interface, etc) or this PR has already received feedback and approval from Product or DX

Jeremynadal33 avatar Aug 09 '24 16:08 Jeremynadal33

Any update on this PR? We have implemented an override macro as a workaround but would be nice if this gets fixed in the adapter code base.

sp-cveeragandham avatar Aug 28 '24 22:08 sp-cveeragandham

Hi there, is there an update on this PR? We have the same issue with incremental models in databricks in https://github.com/databricks/dbt-databricks/issues/780 , thanks.

ggng-jaz avatar Sep 09 '24 13:09 ggng-jaz

I apologize, I don't have time to investigate more currently. If anyone wants to add its stone, feel free to copy the code or tell me and I will give you access to my repo so you can push... Otherwise, I will come back to this as soon as I have enough time 🙏

Jeremynadalmirakl avatar Sep 09 '24 13:09 Jeremynadalmirakl

Iceberg also supports dropping columns so ideally this change could incorporate that too?

PhilHenson82 avatar Jan 30 '25 10:01 PhilHenson82

This PR has been marked as Stale because it has been open with no activity as of late. If you would like the PR to remain open, please comment on the PR or else it will be closed in 7 days.

github-actions[bot] avatar Jul 30 '25 02:07 github-actions[bot]

maybe we should close this properly: i tested that successfully on march 27th, the config triggers ALTER TABLE ... DROP COLUMNS (...)

i just don't see a commit/release that was responsible...

data-blade avatar Jul 31 '25 19:07 data-blade