dbt-presto
dbt-presto copied to clipboard
Support for incremental models
This is one of the killer DBT features and I think that it should be available in Presto as well. I will try to create a PR for this one but it might take a few weeks.
If there's anyone who is willing to implement this feature in less than 2 weeks, feel free to comment.
Thanks @buremba! How are you thinking about implementing incremental models? We like to use the merge
statement on Snowflake/BigQuery, but I don't believe that's possible on Presto. For databases that don't support merge
, we implement incremental models with a delete
and an insert
. You can find some more info on this approach here.
I think the delete + insert
approach will work, but it's unfortunately not atomic as I don't believe Presto supports transactions? Do you know how most Presto projects handle incremental updates to a table? My understanding is that this is typically handled with partitions, not insert/delete
statements
@drewbanin The only way to make it work in Presto is delete + insert
at the moment but Presto supports transactions to some extent. (https://prestodb.github.io/docs/current/sql/start-transaction.html)
The problem is that Presto is actually just a distributed query executor and under the hood, it has a concept of connectors which might be an RDBMS, S3, Hadoop, Elasticsearch, etc. AFAIK, only a few connectors support transactions so this feature won't be available in most of the connectors. I will look into the connectors and see which ones support transactions and will let you know.
Are there any updates on this?
Can we use Alter Table with a new location to simulate this? I believe the alter table statement is atomic. We will have to merge existing data with the incoming data and save it to a new location first, and use an alter table to point the original table to the new location. It has the drawback of not being space efficient(maybe a vacuum step) and only suited for infrequent batch uses
They are working on merge into for hive: https://github.com/trinodb/trino/issues/7708
@mdesmet I've opened an issue on the dbt-trino adapter and I'll try to have a proof-of-concept