
[spark] Support scan mode when query with incremental_between

Open JackeyLee007 opened this issue 8 months ago • 1 comments

Purpose

Linked issue: close #xxx

When querying with paimon_incremental_between_timestamp, we want to be able to switch between scan modes:

  • delta or changelog, if every single change is needed.
  • diff, if merge is needed.

In our micro/small-batch job, which runs hourly, we need to know when a record was INSERTed and also when it was DELETEd. This requires delta or changelog mode. In diff mode, the +I and -D operations on the same record could be merged, and then we would not get the -D operation.

But when querying the main table rather than the audit_log table, the merged result is what we expect, so that we get only the INSERTed and UPDATEd records. So we also need diff mode.
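
To make the difference concrete, here is a toy model in plain Python. It is only an illustration of the behaviour described above, not Paimon's implementation: the change tuples, `delta_scan`, `diff_scan` and `state_at` are all hypothetical names, and the diff here only reports rows that exist at the end snapshot and differ from the start snapshot.

```python
from typing import Dict, List, Tuple

# Hypothetical change record: (snapshot_id, row_kind, key, value).
# row_kind is "+I" (insert) or "-D" (delete); input is ordered by snapshot_id.
Change = Tuple[int, str, str, str]


def delta_scan(changes: List[Change], start: int, end: int) -> List[Tuple[str, str, str]]:
    """Return every single change committed in snapshots (start, end]."""
    return [(kind, key, value)
            for snap, kind, key, value in changes
            if start < snap <= end]


def state_at(changes: List[Change], snapshot: int) -> Dict[str, str]:
    """Replay changes up to and including `snapshot` to get the table state."""
    state: Dict[str, str] = {}
    for snap, kind, key, value in changes:
        if snap > snapshot:
            continue
        if kind == "+I":
            state[key] = value
        elif kind == "-D":
            state.pop(key, None)
    return state


def diff_scan(changes: List[Change], start: int, end: int) -> List[Tuple[str, str, str]]:
    """Compare the end state with the start state (simplified merged view)."""
    before = state_at(changes, start)
    after = state_at(changes, end)
    # Only rows present at `end` that are new or changed since `start`.
    return [("+I", key, value)
            for key, value in after.items()
            if before.get(key) != value]


# Record "b" is inserted in snapshot 2 and deleted in snapshot 3.
changes = [(1, "+I", "a", "1"), (2, "+I", "b", "2"), (3, "-D", "b", "2")]

print(delta_scan(changes, 0, 3))  # all three changes, including the -D on "b"
print(diff_scan(changes, 0, 3))   # only "a"; the +I and -D on "b" cancel out
```

In this sketch the delta-style scan keeps the -D for "b", which is what the hourly job needs, while the diff-style scan returns only the merged result.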

Tests

API and Format

Documentation

JackeyLee007 avatar Apr 02 '25 14:04 JackeyLee007

Can you try setting spark.paimon.incremental-between-scan-mode = xx

Zouxxyy avatar Apr 03 '25 02:04 Zouxxyy
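
For reference, a minimal sketch of how that suggestion might be applied from PySpark. It assumes the option can be set per session with a Spark SQL `SET` statement, and that `paimon_incremental_between_timestamp` takes the table name and the two timestamps as string arguments. The helper `incremental_query_statements`, the table name `my_db.orders` and the timestamp values are hypothetical and not taken from this thread.

```python
from typing import List

# Scan modes mentioned in this issue.
SCAN_MODES = {"delta", "changelog", "diff"}


def incremental_query_statements(table: str, start_ts: str, end_ts: str,
                                 scan_mode: str) -> List[str]:
    """Build the SQL statements for one incremental query (hypothetical helper).

    Inputs are interpolated directly, so they must come from trusted code.
    """
    if scan_mode not in SCAN_MODES:
        raise ValueError(f"unsupported scan mode: {scan_mode}")
    return [
        # Session-level setting suggested in the comment above.
        f"SET spark.paimon.incremental-between-scan-mode = {scan_mode}",
        # Assumed call shape: (table name, start timestamp, end timestamp).
        "SELECT * FROM paimon_incremental_between_timestamp("
        f"'{table}', '{start_ts}', '{end_ts}')",
    ]


# Hypothetical table and epoch-millisecond timestamps, for illustration only.
statements = incremental_query_statements(
    "my_db.orders", "1743602400000", "1743606000000", "delta")
for stmt in statements:
    print(stmt)
    # With an existing SparkSession named `spark`: spark.sql(stmt)
```

Switching the last argument to `diff` would give the merged view for queries against the main table, if the setting behaves as suggested.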