[AMORO-3775] Add support for metric-based refresh event trigger in TableRuntimeRefreshExecutor
Why are the changes needed?
Close #3775.
Brief change log
Add support for an MSE-based refresh event:
- Calculate the mean squared error (MSE) of file sizes for each partition from the loaded metadata.
- Filter the partitions that need to be optimized based on a threshold, and trigger pendingInput evaluation if necessary.
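To make the two points above concrete, here is a minimal sketch of the per-partition MSE calculation and the threshold filter. It is not the code in this patch: the class and method names are hypothetical, the reference size that the error is measured against is passed in as a parameter because the description does not state it, and treating "MSE at or above the tolerance" as "needs optimizing" is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Amoro; names are illustrative only.
final class PartitionMseSketch {
  private PartitionMseSketch() {}

  /** Mean squared error of file sizes against a reference size; 0 for an empty partition. */
  static double meanSquaredError(List<Long> fileSizes, double referenceSize) {
    if (fileSizes.isEmpty()) {
      return 0.0;
    }
    double sumOfSquares = 0.0;
    for (long size : fileSizes) {
      double diff = size - referenceSize;
      sumOfSquares += diff * diff;
    }
    return sumOfSquares / fileSizes.size();
  }

  /** Keeps the partitions whose MSE is at or above the tolerance (assumed comparison). */
  static List<String> partitionsToOptimize(
      Map<String, List<Long>> fileSizesByPartition, double referenceSize, double mseTolerance) {
    List<String> result = new ArrayList<>();
    for (Map.Entry<String, List<Long>> entry : fileSizesByPartition.entrySet()) {
      if (meanSquaredError(entry.getValue(), referenceSize) >= mseTolerance) {
        result.add(entry.getKey());
      }
    }
    return result;
  }
}
```

With a tolerance of 0 every partition passes this filter, so the filter only takes effect once a positive tolerance is configured.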
How was this patch tested?
- [x] Add some test cases that check the changes thoroughly including negative and positive cases if possible
- [ ] Add screenshots for manual tests if appropriate
- [x] Run test locally before making a pull request
Documentation
- Does this pull request introduce a new feature? (yes / no)
- If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented)
Can we move forward with this feature now? @Jzjsnow @klion26
Sure, I've updated the branch and added the new evaluation criteria discussed earlier (see Step 1 for details).
The current conditions for triggering pendingInput evaluation based on metrics are as follows:
Step 1: If the condition delete file=0 && avg file size > target size * ratio is met, the evaluation is considered unnecessary and will be skipped.
Step 2: Calculate detailed attributes for each partition in the table, including the sum of squared errors for file sizes. If this exceeds the file size tolerance threshold, the pendingInput requires evaluation.
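As a rough illustration of the Step 1 check, the sketch below expresses the skip condition as a single predicate. The class name, method name, and parameters are hypothetical and do not correspond to identifiers in the branch. Returning false when there are no data files is an assumption made only to avoid dividing by zero.

```java
// Hypothetical helper, not part of Amoro; names are illustrative only.
final class SkipCheckSketch {
  private SkipCheckSketch() {}

  /**
   * Returns true when evaluation can be skipped:
   * delete file count = 0 && avg file size > target size * ratio.
   */
  static boolean canSkipEvaluation(
      long deleteFileCount,
      long dataFileCount,
      long totalDataFileSize,
      long targetFileSize,
      double ratio) {
    if (dataFileCount <= 0) {
      // Assumption: the average is undefined here, so this check does not decide.
      return false;
    }
    double avgFileSize = (double) totalDataFileSize / dataFileCount;
    return deleteFileCount == 0 && avgFileSize > targetFileSize * ratio;
  }
}
```

For example, with a target size of 128, a ratio of 0.75, and four files totalling 400, the average is 100, which is above 96, so the check passes as long as there are no delete files.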
Note that this update now supports MIX_ICEBERG tables, whereas previously only ICEBERG format was supported.
Please take a look when you are free. Looking forward to your feedback! @xxubai @zhoujinsong @klion26
In the latest commit, we've revamped the logic of EventBasedTrigger with key adjustments and new configurations:
The EventBasedTrigger now includes two key parameters:
- FallbackInterval: The minimum interval for executing the original tryEvaluatingPendingInput logic. It prevents false positives or missed triggers from metadata metric-driven evaluation. Defaults to -1 (disabled); enabled when set to >=0.
- MseTolerance: The tolerance threshold for partition file size MSE (default: 0). Partitions whose actual MSE is below this threshold are considered not to need optimization.
When enabled, the flow is now:
- Determine if `tryEvaluatingPendingInput` needs to run:
  - Check if the FallbackInterval is met to trigger `tryEvaluatingPendingInput` directly.
  - Skip evaluation for empty tables.
  - Skip if the condition `delete file count = 0 && avg file size > target size * ratio` is satisfied (no need for pending input evaluation).
- Execute `tryEvaluatingPendingInput` if necessary:
  - Use the existing scan logic to retrieve partition file information.
  - Judge if each partition requires pending status: if the MSE threshold is met, further determine the optimization type (minor/major/full).
  - Update pendingInput related information.
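The ordering of the checks in the first half of this flow can be sketched as follows. This is only an illustration under stated assumptions, not the code in the commit: the class, enum, and parameter names are hypothetical, times are assumed to be epoch milliseconds, and "the FallbackInterval is met" is assumed to mean that at least that much time has passed since the last evaluation.

```java
// Hypothetical helper, not part of Amoro; names are illustrative only.
final class TriggerDecisionSketch {
  private TriggerDecisionSketch() {}

  enum Decision {
    EVALUATE_BY_FALLBACK,
    SKIP_EMPTY_TABLE,
    SKIP_WELL_SIZED,
    EVALUATE_BY_METRICS
  }

  /** A negative interval (the default -1) disables the fallback; >= 0 enables it. */
  static boolean fallbackDue(long fallbackIntervalMs, long lastEvaluatedMs, long nowMs) {
    return fallbackIntervalMs >= 0 && nowMs - lastEvaluatedMs >= fallbackIntervalMs;
  }

  /**
   * Applies the checks in the order listed above.
   * wellSizedWithoutDeletes stands for the precomputed condition
   * delete file count = 0 && avg file size > target size * ratio.
   */
  static Decision decide(
      long fallbackIntervalMs,
      long lastEvaluatedMs,
      long nowMs,
      long dataFileCount,
      boolean wellSizedWithoutDeletes) {
    if (fallbackDue(fallbackIntervalMs, lastEvaluatedMs, nowMs)) {
      return Decision.EVALUATE_BY_FALLBACK;
    }
    if (dataFileCount == 0) {
      return Decision.SKIP_EMPTY_TABLE;
    }
    if (wellSizedWithoutDeletes) {
      return Decision.SKIP_WELL_SIZED;
    }
    return Decision.EVALUATE_BY_METRICS;
  }
}
```

Checking the fallback first means that a due fallback evaluation runs even for a table the metric checks would otherwise skip, which matches its stated purpose of guarding against missed triggers.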
Please take a look when you are free! @xxubai @klion26
Codecov Report
:x: Patch coverage is 0% with 101 lines in your changes missing coverage. Please review.
:white_check_mark: Project coverage is 4.77%. Comparing base (cbdc517) to head (2fb2dc7).
:warning: Report is 15 commits behind head on master.
Additional details and impacted files
@@ Coverage Diff @@
## master #3776 +/- ##
============================================
- Coverage 22.12% 4.77% -17.36%
+ Complexity 2461 471 -1990
============================================
Files 445 449 +4
Lines 40897 41048 +151
Branches 5767 5784 +17
============================================
- Hits 9050 1958 -7092
- Misses 31089 38896 +7807
+ Partials 758 194 -564
| Flag | Coverage Δ | |
|---|---|---|
| trino | 4.77% <0.00%> (-17.36%) | :arrow_down: |
LGTM. The unit tests also need to be fixed.