paimon
[spark] Support scan mode when query with incremental_between
Purpose
Linked issue: close #xxx
When querying with paimon_incremental_between_timestamp, we want to switch among different scan modes.
- delta or changelog, if every single change is needed.
- diff, if merge is needed.
Our micro/small-batch job runs hourly and needs to know both when a record is INSERTed and when it is DELETEd. This requires delta or changelog mode. In diff mode, the +I and -D operations for the same record can be merged, so the -D operation would be lost.
But when querying the main table rather than the audit_log table, the merged result is what we expect: only the INSERTed and UPDATEd records. So we also need diff mode.
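To make the difference concrete, here is a toy Python model of the two behaviours described above. It is only a sketch and not Paimon's implementation: the function names (`delta_scan`, `diff_scan`, `state_at`), the tuple layout, the treatment of an update as an upserting +I, and the choice of an exclusive start and inclusive end are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical changelog entry: (snapshot_id, row_kind, key, value).
# row_kind is "+I" (insert/upsert) or "-D" (delete).
Change = Tuple[int, str, str, str]
Row = Tuple[str, str, str]


def delta_scan(changes: List[Change], start: int, end: int) -> List[Row]:
    """Return every change with start < snapshot_id <= end, unmerged."""
    return [(kind, key, value)
            for sid, kind, key, value in changes
            if start < sid <= end]


def state_at(changes: List[Change], snapshot_id: int) -> Dict[str, str]:
    """Replay the changelog up to and including snapshot_id."""
    state: Dict[str, str] = {}
    for sid, kind, key, value in sorted(changes, key=lambda c: c[0]):
        if sid > snapshot_id:
            break
        if kind == "+I":
            state[key] = value
        elif kind == "-D":
            state.pop(key, None)
    return state


def diff_scan(changes: List[Change], start: int, end: int) -> List[Row]:
    """Compare the state at `start` with the state at `end` and return
    only rows that are new or changed at `end` (assumed: no deletes)."""
    before = state_at(changes, start)
    after = state_at(changes, end)
    return [("+I", key, value)
            for key, value in sorted(after.items())
            if before.get(key) != value]


changes: List[Change] = [
    (1, "+I", "a", "v1"),
    (2, "+I", "b", "v1"),  # "b" is inserted ...
    (3, "-D", "b", "v1"),  # ... and deleted inside the window
    (3, "+I", "a", "v2"),  # "a" is updated
    (4, "+I", "c", "v1"),
]

delta_rows = delta_scan(changes, 1, 4)  # keeps the +I and -D for "b"
diff_rows = diff_scan(changes, 1, 4)    # "b" disappears entirely
```

In this model the delta result still contains both the +I and the -D for `b`, which is what the hourly audit job needs, while the diff result contains only the net new or updated rows `a` and `c`.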
Tests
API and Format
Documentation
Can you try setting spark.paimon.incremental-between-scan-mode = xx?