Add single column transformer
- ColumnReaderTransformer wraps a ColumnReader and applies transform functions efficiently. It avoids Object conversion unless a transformation is necessary
- Create the ColumnTransformer interface and 2 implementations: DataTypeColumnTransformer and NullValueColumnTransformer
- Modify DefaultValueColumnReader to always return null so that NullValueColumnTransformer takes effect. Clients need to wrap it in ColumnReaderTransformer for this
The objective is for the above interfaces to handle single-column transformations.
Multi-column transformations need to be handled separately via a different interface.
This depends on https://github.com/apache/pinot/pull/17293 for new methods in the ColumnReader interface
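To make the wrapping idea in the description above concrete, here is a minimal, self-contained sketch. It is not the code in this PR: `SimpleColumnReader`, `SimpleColumnTransformer`, `ToIntTransformer`, `NullToDefaultTransformer`, and `TransformingReader` are hypothetical, simplified stand-ins, and the real ColumnReader interface (including the new methods from the dependent PR) is not modeled. The sketch only shows a reader wrapper that applies a chain of single-column transformers and skips any transformer that reports itself as a no-op. It does not model the primitive read path that avoids Object conversion.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class ColumnTransformSketch {

  /** Hypothetical, simplified stand-in for a column reader (not Pinot's ColumnReader). */
  interface SimpleColumnReader {
    boolean hasNext();
    Object next(); // may return null
  }

  /** Hypothetical single-column transformer: one value in, one value out. */
  interface SimpleColumnTransformer {
    boolean isNoOp();
    Object transform(Object value);
  }

  /** Converts values to Integer; reports a no-op when the source column is already INT. */
  static final class ToIntTransformer implements SimpleColumnTransformer {
    private final boolean _sourceIsInt;
    ToIntTransformer(boolean sourceIsInt) { _sourceIsInt = sourceIsInt; }
    @Override public boolean isNoOp() { return _sourceIsInt; }
    @Override public Object transform(Object value) {
      if (value == null || value instanceof Integer) {
        return value; // nulls are left for the null-value transformer
      }
      return Integer.parseInt(value.toString().trim());
    }
  }

  /** Replaces null with a default value. */
  static final class NullToDefaultTransformer implements SimpleColumnTransformer {
    private final Object _defaultValue;
    NullToDefaultTransformer(Object defaultValue) { _defaultValue = defaultValue; }
    @Override public boolean isNoOp() { return false; }
    @Override public Object transform(Object value) {
      return value == null ? _defaultValue : value;
    }
  }

  /** Wraps a reader and applies only the transformers that actually do work. */
  static final class TransformingReader implements SimpleColumnReader {
    private final SimpleColumnReader _delegate;
    private final List<SimpleColumnTransformer> _active = new ArrayList<>();

    TransformingReader(SimpleColumnReader delegate, SimpleColumnTransformer... transformers) {
      _delegate = delegate;
      for (SimpleColumnTransformer t : transformers) {
        if (!t.isNoOp()) {
          _active.add(t); // no-op transformers are dropped once, up front
        }
      }
    }

    int activeCount() { return _active.size(); }

    @Override public boolean hasNext() { return _delegate.hasNext(); }

    @Override public Object next() {
      Object value = _delegate.next();
      for (SimpleColumnTransformer t : _active) {
        value = t.transform(value);
      }
      return value;
    }
  }

  /** Test helper: an in-memory reader over the given values. */
  static SimpleColumnReader readerOf(Object... values) {
    final Iterator<Object> it = Arrays.asList(values).iterator();
    return new SimpleColumnReader() {
      @Override public boolean hasNext() { return it.hasNext(); }
      @Override public Object next() { return it.next(); }
    };
  }

  static List<Object> readAll(SimpleColumnReader reader) {
    List<Object> out = new ArrayList<>();
    while (reader.hasNext()) {
      out.add(reader.next());
    }
    return out;
  }

  public static void main(String[] args) {
    SimpleColumnReader reader = new TransformingReader(
        readerOf("1", null, " 42 "),
        new ToIntTransformer(false),
        new NullToDefaultTransformer(Integer.MIN_VALUE));
    System.out.println(readAll(reader)); // prints [1, -2147483648, 42]
  }
}
```

In this sketch the data type conversion runs before the null replacement, so a null produced by the underlying reader reaches the null-value step unchanged. That ordering is an assumption of the example, not something the description states.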
Codecov Report
:x: Patch coverage is 89.85507% with 14 lines in your changes missing coverage. Please review.
:white_check_mark: Project coverage is 63.25%. Comparing base (95d43c0) to head (94d7228).
:warning: Report is 1 commits behind head on master.
Additional details and impacted files
@@ Coverage Diff @@
## master #17304 +/- ##
============================================
- Coverage 63.28% 63.25% -0.03%
Complexity 1474 1474
============================================
Files 3154 3158 +4
Lines 188007 188056 +49
Branches 28782 28793 +11
============================================
- Hits 118977 118957 -20
- Misses 59807 59879 +72
+ Partials 9223 9220 -3
| Flag | Coverage Δ | |
|---|---|---|
| custom-integration1 | 100.00% <ø> (ø) | |
| integration | 100.00% <ø> (ø) | |
| integration1 | 100.00% <ø> (ø) | |
| integration2 | 0.00% <ø> (ø) | |
| java-11 | 63.23% <89.85%> (-0.04%) | :arrow_down: |
| java-21 | 63.22% <89.85%> (+<0.01%) | :arrow_up: |
| temurin | 63.25% <89.85%> (-0.03%) | :arrow_down: |
| unittests | 63.25% <89.85%> (-0.03%) | :arrow_down: |
| unittests1 | 55.65% <47.10%> (-0.04%) | :arrow_down: |
| unittests2 | 33.93% <89.13%> (-0.01%) | :arrow_down: |
Please edit the PR description. It refers to the changes that were present in the previous commits.
@krishan1390, can you add a plan for how all the transformers will be supported in the future? The current implementation is too narrow in scope, and the description doesn't discuss long-term usability.
Also, please give a brief description of the flows that will be affected by or will use the columnTransformer, and say whether this is configurable. Segments will eventually be loaded on servers, and a server can detect that a segment needs reprocessing based on the latest schema and config, so it should be safe to opt out of data type transformation and make that a config in the segment builder.