dbt-trino
Incremental loads don't work where schema is different. Implement on_schema_change
Hey guys,
If your destination table doesn't have the same columns as your model, the whole model fails. Example:
If table A has ID, Text1, and Date1 in its field list, and the dbt model selects only ID and Text1 from table A, then on dbt run the model will fail regardless of the on_schema_change setting. It seems the incremental macro retrieves the list of columns to insert from the destination table and never checks them against the model for schema diffs?
Relevant documentation: https://docs.getdbt.com/docs/building-a-dbt-project/building-models/configuring-incremental-models#what-if-the-columns-of-my-incremental-model-change
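For context, on_schema_change is set in the model config block; per the dbt docs it accepts ignore (the default), fail, append_new_columns, and sync_all_columns. A minimal sketch of the setup that triggers this issue (model name and ref are illustrative):

```sql
{{
    config(
        materialized='incremental',
        on_schema_change='append_new_columns'
    )
}}

-- Model selects only ID and Text1, while the destination
-- table already has ID, Text1, and Date1.
select ID, Text1
from {{ ref('stg_table_a') }}
```

With the reported behavior, this run fails even though append_new_columns should tolerate the schema difference.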
I've sorted my particular issue with an override macro using your function:
{% macro dbt_trino_get_append_sql(tmp_relation, target_relation) %}
    {%- set dest_columns = adapter.get_columns_in_relation(tmp_relation) -%}
    {%- set dest_cols_csv = dest_columns | map(attribute='quoted') | join(', ') -%}
    insert into {{ target_relation }} ({{ dest_cols_csv }})
    select {{ dest_cols_csv }} from {{ tmp_relation.include(database=true, schema=true) }};
    drop table if exists {{ tmp_relation }};
{% endmacro %}
I had to switch to retrieving the columns from the tmp_relation and then add an explicit column list to the insert, because otherwise data type mismatches happen when inserting ordinally. I did want to use an intersect function to only get the fields that were in both the destination and the tmp relation, but there aren't any built-in Jinja functions for that. I might throw in a PR to extend it, but for now this suits my needs.
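For anyone curious, the intersection logic described above is simple to express; here is an illustrative Python sketch (not dbt or Jinja API, just the idea): keep only the tmp-relation columns that also exist in the destination, preserving order and matching case-insensitively.

```python
def intersect_columns(tmp_columns, dest_columns):
    """Return the tmp-relation columns that also exist in the
    destination table, preserving the tmp relation's column order.
    Matching is case-insensitive, since SQL identifiers usually are."""
    dest_lower = {c.lower() for c in dest_columns}
    return [c for c in tmp_columns if c.lower() in dest_lower]

# Columns present in the model's tmp relation vs. the destination table
tmp = ["ID", "Text1"]
dest = ["ID", "Text1", "Date1"]
print(intersect_columns(tmp, dest))  # → ['ID', 'Text1']
```

Inside a macro, the same filtering would be done with a Jinja loop over the two column lists before building the insert statement.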
@prgx-aeveri01 would you like to open a PR to fix the incremental macro? Your snippet looks like a fix for the issue.
Done! I haven't updated the README in case other changes were getting wrapped up into the released version. If you want me to do that, feel free to add comments on the PR.
https://github.com/starburstdata/dbt-trino/pull/49