
Support reading Iceberg split with equality deletes

Open yingsu00 opened this issue 1 year ago • 3 comments

This PR introduces EqualityDeleteFileReader, which is used to read Iceberg splits with equality delete files.
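For context, an equality delete file lists values of certain key columns; any data row whose key matches a listed value is treated as deleted. The sketch below shows only that semantics — the struct and function names are invented for illustration and are not Velox's API. Per the discussion in this thread, the actual reader pushes the delete values into the ScanSpec as filters rather than post-filtering materialized rows.

```cpp
#include <cstdint>
#include <set>
#include <string>
#include <vector>

// Illustrative sketch only, not Velox's API: an equality delete file lists
// key values (here a single int64 id column); data rows whose key matches
// any listed value are filtered out of the scan output.
struct DataRow {
  int64_t id;
  std::string payload;
};

std::vector<DataRow> applyEqualityDeletes(
    const std::vector<DataRow>& rows,
    const std::set<int64_t>& deletedIds) {
  std::vector<DataRow> kept;
  for (const auto& row : rows) {
    if (deletedIds.count(row.id) == 0) {
      kept.push_back(row);  // key not present in the delete file; keep the row
    }
  }
  return kept;
}
```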

yingsu00 avatar Sep 25 '24 05:09 yingsu00

Deploy Preview for meta-velox canceled.

Latest commit: 8b0ed7ff524718d4b5e96c83142ddaef1253a6fa
Latest deploy log: https://app.netlify.com/projects/meta-velox/deploys/68884d67fb888500089e9bf9

netlify[bot] avatar Sep 25 '24 05:09 netlify[bot]

@Yuhta This PR replaces https://github.com/facebookincubator/velox/pull/9063 and is ready for review again. As you suggested, we now update the ScanSpec in place, inserting and removing columns, instead of cloning the whole ScanSpec. Thanks a lot for reviewing!

yingsu00 avatar Oct 11 '24 22:10 yingsu00

@yingsu00 can you please rebase again? There is a conflict with other changes.

czentgr avatar Oct 24 '24 23:10 czentgr

@Yuhta Just rebased, appreciate your review again. Thanks!

yingsu00 avatar Nov 13 '24 02:11 yingsu00

cc: @liujiayi771 Would you like to take a review? Thanks.

rui-mo avatar Nov 22 '24 06:11 rui-mo

> cc: @liujiayi771 Would you like to take a review? Thanks.

@liujiayi771 Can you take a look at the PR if possible? Should we add something on the Gluten side after this PR is merged? It's requested by a Gluten customer.

FelixYBW avatar Nov 26 '24 00:11 FelixYBW

@FelixYBW Yes, Gluten needs to make some minor changes to accommodate this PR. However, Spark cannot produce equality delete files, so we need to use Flink to generate Iceberg tables with equality delete files for testing. I will perform some tests this week.

liujiayi771 avatar Nov 26 '24 00:11 liujiayi771

@nmahadevuni can you help resolve the conflicts with main and rebase? Thanks.

majetideepak avatar Jan 27 '25 21:01 majetideepak

This PR also has a test failure in HashJoin.dynamicFilters that goes away without these changes. It is easily reproducible locally and can be seen failing in the CI pipeline.

[ RUN      ] HashJoinTest.dynamicFilters
I0211 11:45:19.812211 1319372 Compression.cpp:629] Initialized zstd compressor with compression level 7
/root/lakehouse/velox/velox/exec/tests/HashJoinTest.cpp:4682: Failure
Expected: (getInputPositions(task, 1)) < (numRowsProbe * numSplits), actual: 3330 vs 3330
Google Test trace:
/root/lakehouse/velox/velox/exec/tests/HashJoinTest.cpp:4669: hasSpill:false
/root/lakehouse/velox/velox/exec/tests/HashJoinTest.cpp:665: Without Spill
/root/lakehouse/velox/velox/exec/tests/utils/QueryAssertions.cpp:1182: Failure
Failed
Expected 1000, got 3330
2330 extra rows, 0 missing rows
10 of extra rows:
        1
        1
        1
        1
        1
        1
        1
        1
        1
        1

0 of missing rows:

Note: DuckDB only supports timestamps of millisecond precision. If this test involves timestamp inputs, please make sure you use the right precision.
DuckDB query: SELECT t.c1 + 1 FROM t, u WHERE t.c0 = u.c0 AND t.c0 < 500
Google Test trace:
/root/lakehouse/velox/velox/exec/tests/HashJoinTest.cpp:665: Without Spill
/root/lakehouse/velox/velox/exec/tests/HashJoinTest.cpp:4805: Failure
Expected: (getInputPositions(task, 1)) < (numRowsProbe * numSplits), actual: 3330 vs 3330
Google Test trace:
/root/lakehouse/velox/velox/exec/tests/HashJoinTest.cpp:4792: hasSpill:false
/root/lakehouse/velox/velox/exec/tests/HashJoinTest.cpp:665: Without Spill
/root/lakehouse/velox/velox/exec/tests/utils/QueryAssertions.cpp:1182: Failure
Failed
Expected 830, got 2450
1620 extra rows, 0 missing rows
10 of extra rows:
        1
        1
        1
        1
        1
        1
        1
        1
        1
        1

0 of missing rows:

Note: DuckDB only supports timestamps of millisecond precision. If this test involves timestamp inputs, please make sure you use the right precision.
DuckDB query: SELECT t.c1 + 1 FROM t, u WHERE t.c0 = u.c0 AND t.c0 < 200
Google Test trace:
/root/lakehouse/velox/velox/exec/tests/HashJoinTest.cpp:665: Without Spill
/root/lakehouse/velox/velox/exec/tests/utils/QueryAssertions.cpp:1182: Failure
Failed
Expected 830, got 2450
1620 extra rows, 0 missing rows
10 of extra rows:
        1
        1
        1
        1
        1
        1
        1
        1
        1
        1

0 of missing rows:

Note: DuckDB only supports timestamps of millisecond precision. If this test involves timestamp inputs, please make sure you use the right precision.
DuckDB query: SELECT t.c1 + 1 FROM t WHERE t.c0 IN (SELECT c0 FROM u) AND t.c0 < 200
Google Test trace:
/root/lakehouse/velox/velox/exec/tests/HashJoinTest.cpp:665: Without Spill
[  FAILED  ] HashJoinTest.dynamicFilters (4442 ms)
[----------] 1 test from HashJoinTest (4443 ms total)

czentgr avatar Feb 11 '25 21:02 czentgr

Thanks @czentgr. I have fixed the issue.

nmahadevuni avatar Feb 12 '25 06:02 nmahadevuni

@Yuhta @rui-mo Will you be able to review again? Thanks!

yingsu00 avatar Feb 28 '25 04:02 yingsu00

@Yuhta @rui-mo We've updated this PR and added a separate commit with a fix for the nested subfield filter, plus a new test that shows why the filter on a nested column must be added to and removed from the ScanSpec recursively. Would appreciate your review again! Thank you.
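To illustrate why the nested case needs recursion: a filter on a subfield path like c0.a sits on a leaf, but every intermediate node on the path must be tagged so a later removal can walk the same path and un-tag it. This is a hypothetical sketch — SpecNode, filterBelow, and both functions are invented names for this example, not Velox's ScanSpec API, and the removal assumes a single filter below the path.

```cpp
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical ScanSpec-like tree (illustrative, not Velox's API).
struct SpecNode {
  std::map<std::string, std::unique_ptr<SpecNode>> children;
  bool hasFilter = false;    // a filter is attached at this node
  bool filterBelow = false;  // some descendant carries a filter
};

// Attach a filter at the leaf of `path`, tagging every internal node on the way.
void addNestedFilter(SpecNode& root, const std::vector<std::string>& path) {
  SpecNode* node = &root;
  for (const auto& name : path) {
    node->filterBelow = true;  // tag internal nodes along the path
    auto& child = node->children[name];
    if (!child) {
      child = std::make_unique<SpecNode>();
    }
    node = child.get();
  }
  node->hasFilter = true;
}

// Walk the same path to clear the filter and un-tag the internal nodes
// (assumes this was the only filter below the path).
void removeNestedFilter(SpecNode& root, const std::vector<std::string>& path) {
  SpecNode* node = &root;
  std::vector<SpecNode*> visited;
  for (const auto& name : path) {
    visited.push_back(node);
    auto it = node->children.find(name);
    if (it == node->children.end()) {
      return;  // path not present; nothing to remove
    }
    node = it->second.get();
  }
  node->hasFilter = false;
  for (auto* n : visited) {
    n->filterBelow = false;
  }
}
```

Tagging only the top-level node would break removal for paths deeper than one level, which is what the new test exercises.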

yingsu00 avatar Mar 12 '25 21:03 yingsu00

Please describe the restrictions in the PR description, e.g. no support for hugeint, float, Spark neq, timestamp.

jinchengchenghh avatar Mar 13 '25 17:03 jinchengchenghh

If the user runs a query like select id, data, _deleted from table, how can we return the metadata column _deleted, as well as _pos?

jinchengchenghh avatar Mar 14 '25 14:03 jinchengchenghh

> Please describe the restrictions in the PR description, e.g. no support for hugeint, float, Spark neq, timestamp.
>
> If the user runs a query like select id, data, _deleted from table, how can we return the metadata column _deleted, as well as _pos?

Currently Presto Java and C++ only support the $path and $data_sequence_number metadata columns. See https://prestodb.io/docs/current/connector/iceberg.html#extra-hidden-metadata-columns. We will be adding more in the future; see https://github.com/prestodb/presto/issues/24733

yingsu00 avatar Mar 16 '25 02:03 yingsu00

@rui-mo @Yuhta The PR is updated, could you please review again? Thank you!

yingsu00 avatar Mar 18 '25 23:03 yingsu00

@ethanyzhang imported this issue into IBM GitHub Enterprise

prestodb-ci avatar Apr 04 '25 07:04 prestodb-ci

@yingsu00 Do you think all the review comments have been resolved so far?

PingLiuPing avatar Aug 01 '25 15:08 PingLiuPing

> @yingsu00 Do you think all the review comments have been resolved so far?

Yes, we're waiting for Meta's feedback. We also added a new test, TestSubFieldEqualityDelete, to show why we needed to tag internal (not top-level) nodes in the ScanSpec. We've been resolving merge conflicts regularly, but they arise too often and take too much time. If this PR can get reviewed soon, we'll rebase again.

yingsu00 avatar Sep 10 '25 20:09 yingsu00

@yingsu00 Do we still want to proceed with this change, or should we just start a new IcebergDataSource that directly inherits from the abstract DataSource without going through HiveDataSource?

Yuhta avatar Sep 19 '25 22:09 Yuhta

> @yingsu00 Do we still want to proceed with this change, or should we just start a new IcebergDataSource that directly inherits from the abstract DataSource without going through HiveDataSource?

@Yuhta Hi Jimmy, thanks for reviewing again! We are preparing the new IcebergConnector PR, and it will be out in a couple of weeks. I think we can wait until that one is merged first; then we will re-submit this PR against the new IcebergConnector. Thank you in advance for reviewing the upcoming PRs! Your review is very much appreciated.

yingsu00 avatar Oct 01 '25 02:10 yingsu00