kyuubi
enable MaxScanStrategy when accessing iceberg datasource
:mag: Description
Issue References
Currently, MaxScanStrategy can limit the maximum scanned file size and the maximum number of scanned partitions for some datasources, such as Hive. This PR extends MaxScanStrategy to support the Iceberg datasource as well.
Describe Your Solution
Get the statistics about the scanned files and partitions from the Iceberg DataSource V2 API.
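To illustrate the idea only, here is a minimal sketch of the kind of check such a strategy performs once scan statistics are available. It is not Kyuubi's implementation: `ScanStats`, `MaxScanExceeded`, and `check_scan` are hypothetical names, and the statistics are passed in by hand rather than read from Iceberg.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ScanStats:
    """Hypothetical container for what a planned scan reports."""
    total_file_bytes: int
    partition_count: int


class MaxScanExceeded(Exception):
    """Raised when a planned scan exceeds a configured limit."""


def check_scan(stats: ScanStats,
               max_file_bytes: Optional[int] = None,
               max_partitions: Optional[int] = None) -> None:
    """Reject a scan whose statistics exceed the given limits.

    A limit of None means that check is disabled.
    """
    if max_file_bytes is not None and stats.total_file_bytes > max_file_bytes:
        raise MaxScanExceeded(
            f"scan reads {stats.total_file_bytes} bytes, "
            f"limit is {max_file_bytes}")
    if max_partitions is not None and stats.partition_count > max_partitions:
        raise MaxScanExceeded(
            f"scan reads {stats.partition_count} partitions, "
            f"limit is {max_partitions}")
```

In this sketch a query is rejected before execution if either limit is exceeded, and each limit can be switched off independently by leaving it unset.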
Types of changes :bookmark:
- [ ] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
Test Plan
Behavior Without This Pull Request :coffin:
Behavior With This Pull Request :tada:
Related Unit Tests
Checklists
Author Self Checklist
- [ ] My code follows the style guidelines of this project
- [ ] I have performed a self-review
- [ ] I have commented my code, particularly in hard-to-understand areas
- [ ] I have made corresponding changes to the documentation
- [ ] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my feature works
- [ ] New and existing unit tests pass locally with my changes
- [ ] This patch was not authored or co-authored using Generative Tooling
Committer Pre-Merge Checklist
- [ ] Pull request title is okay.
- [ ] No license issues.
- [ ] Milestone correctly set?
- [ ] Test coverage is ok
- [ ] Assignees are selected.
- [ ] Minimum number of approvals
- [ ] No changes are requested
Please make sure that the Kyuubi Spark extension also works well on iceberg-free Spark runtime.
good point. Thanks
Fixed. Plz review again.
Codecov Report
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 58.40%. Comparing base (67f099a) to head (3c5b0c2). Report is 23 commits behind head on master.
Additional details and impacted files
@@             Coverage Diff              @@
##             master    #5852      +/-   ##
============================================
- Coverage     58.58%   58.40%   -0.19%
  Complexity       24       24
============================================
  Files           649      651       +2
  Lines         39379    39513     +134
  Branches       5415     5441      +26
============================================
+ Hits          23070    23076       +6
- Misses        13841    13955     +114
- Partials       2468     2482      +14
@zhaohehuhu Could you add a unit test?
Sure. I will add it. Thanks!
Thanks @wForget @pan3793
Disabled the rule that checks maxPartitions for DSv2. @wForget
Thanks, merged to master/1.9