Support reading Iceberg split with equality deletes
This PR introduces EqualityDeleteFileReader, which is used to read Iceberg splits with equality delete files.
Deploy Preview for meta-velox canceled.
| Name | Link |
|---|---|
| Latest commit | 8b0ed7ff524718d4b5e96c83142ddaef1253a6fa |
| Latest deploy log | https://app.netlify.com/projects/meta-velox/deploys/68884d67fb888500089e9bf9 |
@Yuhta This PR replaces https://github.com/facebookincubator/velox/pull/9063 and is ready for review again. As you suggested, we now update the ScanSpec by inserting/removing the new columns instead of cloning the whole ScanSpec. Thanks a lot for reviewing!
@yingsu00 can you please rebase again? There is a conflict with other changes.
@Yuhta Just rebased, appreciate your review again. Thanks!
cc: @liujiayi771 Would you like to take a review? Thanks.
@liujiayi771 Can you take a look at the PR if possible? Should we add anything on the Gluten side after this PR is merged? It's requested by a Gluten customer.
@FelixYBW Yes, Gluten needs to make some minor changes to accommodate this PR. However, Spark cannot produce equality delete files, so we need to use Flink to generate Iceberg tables with equality delete files for testing. I will run some tests this week.
@nmahadevuni can you help resolve the conflicts with main and rebase? Thanks.
This PR also has a test failure in HashJoinTest.dynamicFilters that goes away without these changes. It is easily reproducible locally and can be seen failing in the CI pipeline:
```
[ RUN ] HashJoinTest.dynamicFilters
I0211 11:45:19.812211 1319372 Compression.cpp:629] Initialized zstd compressor with compression level 7
/root/lakehouse/velox/velox/exec/tests/HashJoinTest.cpp:4682: Failure
Expected: (getInputPositions(task, 1)) < (numRowsProbe * numSplits), actual: 3330 vs 3330
Google Test trace:
/root/lakehouse/velox/velox/exec/tests/HashJoinTest.cpp:4669: hasSpill:false
/root/lakehouse/velox/velox/exec/tests/HashJoinTest.cpp:665: Without Spill
/root/lakehouse/velox/velox/exec/tests/utils/QueryAssertions.cpp:1182: Failure
Failed
Expected 1000, got 3330
2330 extra rows, 0 missing rows
10 of extra rows:
1
1
1
1
1
1
1
1
1
1
0 of missing rows:
Note: DuckDB only supports timestamps of millisecond precision. If this test involves timestamp inputs, please make sure you use the right precision.
DuckDB query: SELECT t.c1 + 1 FROM t, u WHERE t.c0 = u.c0 AND t.c0 < 500
Google Test trace:
/root/lakehouse/velox/velox/exec/tests/HashJoinTest.cpp:665: Without Spill
/root/lakehouse/velox/velox/exec/tests/HashJoinTest.cpp:4805: Failure
Expected: (getInputPositions(task, 1)) < (numRowsProbe * numSplits), actual: 3330 vs 3330
Google Test trace:
/root/lakehouse/velox/velox/exec/tests/HashJoinTest.cpp:4792: hasSpill:false
/root/lakehouse/velox/velox/exec/tests/HashJoinTest.cpp:665: Without Spill
/root/lakehouse/velox/velox/exec/tests/utils/QueryAssertions.cpp:1182: Failure
Failed
Expected 830, got 2450
1620 extra rows, 0 missing rows
10 of extra rows:
1
1
1
1
1
1
1
1
1
1
0 of missing rows:
Note: DuckDB only supports timestamps of millisecond precision. If this test involves timestamp inputs, please make sure you use the right precision.
DuckDB query: SELECT t.c1 + 1 FROM t, u WHERE t.c0 = u.c0 AND t.c0 < 200
Google Test trace:
/root/lakehouse/velox/velox/exec/tests/HashJoinTest.cpp:665: Without Spill
/root/lakehouse/velox/velox/exec/tests/utils/QueryAssertions.cpp:1182: Failure
Failed
Expected 830, got 2450
1620 extra rows, 0 missing rows
10 of extra rows:
1
1
1
1
1
1
1
1
1
1
0 of missing rows:
Note: DuckDB only supports timestamps of millisecond precision. If this test involves timestamp inputs, please make sure you use the right precision.
DuckDB query: SELECT t.c1 + 1 FROM t WHERE t.c0 IN (SELECT c0 FROM u) AND t.c0 < 200
Google Test trace:
/root/lakehouse/velox/velox/exec/tests/HashJoinTest.cpp:665: Without Spill
[ FAILED ] HashJoinTest.dynamicFilters (4442 ms)
[----------] 1 test from HashJoinTest (4443 ms total)
```
Thanks @czentgr. I have fixed the issue.
@Yuhta @rui-mo Will you be able to review again? Thanks!
@Yuhta @rui-mo We've updated this PR, adding a separate commit that fixes the nested subfield filter, plus a new test showing why the filter must be recursively added to and removed from the nested column in ScanSpec. Would appreciate your review again! Thank you.
Please describe the restrictions in the PR description, e.g. hugeint, float, Spark's not-equal, and timestamp are not supported.
If the user runs a query like `select id, data, _deleted from table`, how can we return the metadata column `_deleted`, and likewise `_pos`?
Currently Presto Java and C++ only support the $path and $data_sequence_number metadata columns. See https://prestodb.io/docs/current/connector/iceberg.html#extra-hidden-metadata-columns. We will be adding more in the future. See https://github.com/prestodb/presto/issues/24733
@rui-mo @Yuhta The PR is updated, could you please review again? Thank you!
@ethanyzhang imported this issue into IBM GitHub Enterprise
@yingsu00 Do you think all the review comments have been resolved so far?
Yes, we're waiting for Meta's feedback. We also added a new test, TestSubFieldEqualityDelete, to show why we needed to tag internal (not top-level) nodes in ScanSpec. We've been resolving conflicts regularly, but they come too often and take too much time. If this PR can get reviewed soon, we'll rebase again.
@yingsu00 Do we still want to proceed with this change, or should we just start a new IcebergDataSource directly inheriting the abstract DataSource without going through HiveDataSource?
@Yuhta Hi Jimmy, thanks for reviewing again! We are preparing the new IcebergConnector PR and it will be out in a couple of weeks. I think we can wait until that one is merged first, and then re-submit this PR on top of the new IcebergConnector. Thank you in advance for reviewing the upcoming PRs! Your review is very much appreciated.