Parallel processing for column-based TRowSet generation

Open bowenliang123 opened this issue 1 year ago • 1 comment

:mag: Description

Issue References 🔗

Subtask of #5808

This pull request fixes #

Describe Your Solution 🔧

  • Support parallel processing for column-based TRowSet generation, using a fork-join pool on the engine side
  • The column order in the TRowSet is still guaranteed by sorting on the column index, which is a very cheap operation
  • Add a config to enable or disable this feature
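
The approach in the list above can be illustrated with a minimal sketch. It is not the code from this pull request: `Column`, `toColumn`, `buildColumns`, and the `parallel` flag are hypothetical names chosen for illustration, and the real config key and Thrift types are not shown. It assumes each column can be built independently from a row-major result, submits one task per column to a `ForkJoinPool`, and then sorts the finished columns by index.

```scala
import java.util.concurrent.ForkJoinPool

import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object ParallelColumnsSketch {
  // Hypothetical stand-in for a Thrift column; not Kyuubi's actual type.
  final case class Column(index: Int, values: Seq[Any])

  // Extracts the values of one column from a row-major result.
  def toColumn(rows: Seq[Seq[Any]], index: Int): Column =
    Column(index, rows.map(row => row(index)))

  // `parallel` plays the role of the enable/disable config (name is an assumption).
  def buildColumns(
      rows: Seq[Seq[Any]],
      numColumns: Int,
      parallel: Boolean,
      pool: ForkJoinPool): Seq[Column] = {
    if (!parallel) {
      // Sequential path: columns are produced in index order.
      (0 until numColumns).map(i => toColumn(rows, i))
    } else {
      // Parallel path: one task per column, executed on the fork-join pool.
      implicit val ec: ExecutionContext = ExecutionContext.fromExecutor(pool)
      val tasks = (0 until numColumns).map(i => Future(toColumn(rows, i)))
      val columns = Await.result(Future.sequence(tasks), 1.minute)
      // Future.sequence already keeps submission order; the explicit sort
      // mirrors the description and stays cheap (one Int key per column).
      columns.sortBy(_.index)
    }
  }
}
```

A caller would create the pool once, for example `new ForkJoinPool(4)`, reuse it across result batches, and shut it down when the engine stops. Because only column indices are sorted, never the column data, the ordering step costs almost nothing compared with building the columns.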

Types of changes :bookmark:

  • [ ] Bugfix (non-breaking change which fixes an issue)
  • [ ] New feature (non-breaking change which adds functionality)
  • [ ] Breaking change (fix or feature that would cause existing functionality to change)

Test Plan 🧪

Behavior Without This Pull Request :coffin:

Behavior With This Pull Request :tada:

I will provide a rough comparison benchmark for this feature in TRowSetGenerator.

Related Unit Tests


Checklists

📝 Author Self Checklist

  • [x] My code follows the style guidelines of this project
  • [x] I have performed a self-review
  • [x] I have commented my code, particularly in hard-to-understand areas
  • [ ] I have made corresponding changes to the documentation
  • [ ] My changes generate no new warnings
  • [ ] I have added tests that prove my fix is effective or that my feature works
  • [ ] New and existing unit tests pass locally with my changes
  • [x] This patch was not authored or co-authored using Generative Tooling

📝 Committer Pre-Merge Checklist

  • [ ] Pull request title is okay.
  • [ ] No license issues.
  • [ ] Milestone correctly set?
  • [ ] Test coverage is ok
  • [ ] Assignees are selected.
  • [ ] Minimum number of approvals
  • [ ] No changes are requested

Be nice. Be informative.

bowenliang123, Dec 28 '23 10:12

Codecov Report

Attention: 3 lines in your changes are missing coverage. Please review.

Comparison is base (5d59cf1) 61.24% compared to head (02120f6) 61.16%. Report is 1 commits behind head on master.

| Files | Patch % | Lines |
|---|---|---|
| ...gine/spark/schema/SparkArrowTRowSetGenerator.scala | 0.00% | 1 Missing :warning: |
| ...apache/kyuubi/engine/result/TRowSetGenerator.scala | 94.44% | 0 Missing and 1 partial :warning: |
| ...ain/scala/org/apache/kyuubi/util/ThreadUtils.scala | 88.88% | 1 Missing :warning: |

Additional details and impacted files
@@             Coverage Diff              @@
##             master    #5927      +/-   ##
============================================
- Coverage     61.24%   61.16%   -0.08%     
  Complexity       23       23              
============================================
  Files           621      621              
  Lines         36864    36900      +36     
  Branches       5014     5016       +2     
============================================
- Hits          22576    22571       -5     
- Misses        11860    11895      +35     
- Partials       2428     2434       +6     


codecov-commenter, Dec 29 '23 07:12