How to do incremental update with new rows where update_key is not set
Suppose I have an Oracle table with two columns, ID (the primary key) and MDATUM (the update key).
| ID | MDATUM |
|---|---|
| 1 | NULL |
| 2 | NULL |
| 3 | 2024-09-18 02:00:00.000 |
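For reference, a minimal sketch of the starting state (the table and schema names are made up for illustration):

```sql
-- Hypothetical Oracle table used only for illustration; names are assumptions.
CREATE TABLE my_schema.my_table (
    id     NUMBER PRIMARY KEY,
    mdatum TIMESTAMP(3)  -- set only when a row is later updated
);

INSERT INTO my_schema.my_table (id, mdatum) VALUES (1, NULL);
INSERT INTO my_schema.my_table (id, mdatum) VALUES (2, NULL);
INSERT INTO my_schema.my_table (id, mdatum)
VALUES (3, TIMESTAMP '2024-09-18 02:00:00.000');
COMMIT;
```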
Now I upload this table to Snowflake.
Next, one row is added to the table (ID=4), and one row is updated (ID=2), which means it gets an MDATUM. The table now looks like this:
| ID | MDATUM | note |
|---|---|---|
| 1 | NULL | |
| 2 | 2024-09-18 03:00:00.000 | this row is updated |
| 3 | 2024-09-18 02:00:00.000 | |
| 4 | NULL | this row is new |
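In SQL terms, the change on the source side is roughly this (same hypothetical names as above):

```sql
-- ID=2 is modified, which sets its MDATUM
UPDATE my_schema.my_table
SET    mdatum = TIMESTAMP '2024-09-18 03:00:00.000'
WHERE  id = 2;

-- ID=4 is brand new, so its MDATUM is still NULL
INSERT INTO my_schema.my_table (id, mdatum) VALUES (4, NULL);
COMMIT;
```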
How can I get those two changed rows (the updated ID=2 and the new ID=4) into Snowflake using an incremental load?
When I load with
```
--mode incremental \
--primary-key 'ID' \
--update-key 'MDATUM' \
```
then I get this table:
| ID | MDATUM | note |
|---|---|---|
| 1 | NULL | |
| 2 | 2024-09-18 03:00:00.000 | correctly updated |
| 3 | 2024-09-18 02:00:00.000 | |
As you can see, ID=2 is updated as expected. The new row (ID=4), on the other hand, is not inserted into the target table.
I believe this is in line with your documentation.
However, from a practical standpoint, this is not helpful. My example above comes from one of our production tables, and the pattern is perfectly logical: when a new row is added, that is not an update, so the update_key stays empty; the update_key is only set when a row is later modified.
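As far as I understand, the incremental read effectively filters on the update_key, roughly like the sketch below (not Sling's actual SQL). A NULL MDATUM can never satisfy that comparison, so brand-new rows are silently skipped:

```sql
-- Rough sketch of what I assume the incremental filter does.
-- :max_mdatum is the max(MDATUM) already present in the Snowflake target.
SELECT id, mdatum
FROM   my_schema.my_table
WHERE  mdatum > :max_mdatum;  -- NULL > :max_mdatum is never TRUE, so ID=4 is excluded
```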
I therefore believe there should be another load strategy that loads "new records after max(update_key)" plus "new records with primary_key larger than max(primary_key)".
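Conceptually, such a strategy would select something like this (again only a sketch with assumed names, not a proposal for the exact SQL):

```sql
-- Rows updated since the last load OR rows that are entirely new.
-- :max_mdatum and :max_id are the current maxima in the Snowflake target.
SELECT id, mdatum
FROM   my_schema.my_table
WHERE  mdatum > :max_mdatum
   OR  id     > :max_id;
```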
A workaround is to run Sling twice:
First, I load using these parameters:
```
--mode incremental \
--primary-key 'ID' \
--update-key 'MDATUM' \
```
This gets me all rows where the update_key MDATUM is newer than max(MDATUM) in my target table, i.e., all updated rows.
Then I run Sling again and this time I use:
```
--mode incremental \
--primary-key 'ID' \
--update-key 'ID' \
```
This should get me all rows where the ID is higher than max(ID) in my target table, i.e., all new rows.
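Against the example above, the two passes would effectively select the following (sketch only, same hypothetical names):

```sql
-- Pass 1: update_key = MDATUM -> picks up the updated row ID=2
SELECT id, mdatum FROM my_schema.my_table
WHERE  mdatum > TIMESTAMP '2024-09-18 02:00:00.000';  -- max(MDATUM) in the target

-- Pass 2: update_key = ID -> picks up the new row ID=4
SELECT id, mdatum FROM my_schema.my_table
WHERE  id > 3;                                         -- max(ID) in the target
```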
I feel my use-case is quite plausible and should be quite widespread.
What are your thoughts on this? And do you see a problem with my workaround?
Would love to hear any feedback on whether using ID for both the primary key and the update key worked.
@justin-fundrise it works in practice. It will replace your existing records in place.