Add a framework to validate each of the Ingestion Transformation functions
This PR adds a validation framework for Pinot transform functions used in ingestion configs. It provides validation hooks for datatype checks in the TransformFunction interface that individual functions can implement to validate their configurations during table creation.
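As a rough illustration of what an opt-in validation hook of this kind could look like, here is a minimal sketch. All names in it (`ValidatingFunctionSketch`, `validateArgumentTypes`, `AbsSketch`, `PassThroughSketch`) are hypothetical and are not taken from the PR or from Pinot's actual `TransformFunction` interface; it only shows the pattern of a default no-op hook that individual functions may override.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a transform function interface; NOT Pinot's actual API.
interface ValidatingFunctionSketch {
  String getName();

  // Default hook: functions that do not override it report no problems,
  // so existing implementations keep working unchanged.
  default List<String> validateArgumentTypes(List<String> argumentTypes) {
    return Collections.emptyList();
  }
}

// A function that opts in: it accepts exactly one numeric argument.
final class AbsSketch implements ValidatingFunctionSketch {
  private static final List<String> NUMERIC = Arrays.asList("INT", "LONG", "FLOAT", "DOUBLE");

  @Override
  public String getName() {
    return "abs";
  }

  @Override
  public List<String> validateArgumentTypes(List<String> argumentTypes) {
    List<String> errors = new ArrayList<>();
    if (argumentTypes.size() != 1) {
      errors.add(getName() + " expects 1 argument, got " + argumentTypes.size());
    } else if (!NUMERIC.contains(argumentTypes.get(0))) {
      errors.add(getName() + " expects a numeric argument, got " + argumentTypes.get(0));
    }
    return errors;
  }
}

// A function that does not opt in and therefore inherits the no-op default.
final class PassThroughSketch implements ValidatingFunctionSketch {
  @Override
  public String getName() {
    return "passthrough";
  }
}
```

With this shape, a table-creation check could call the hook on every configured function and reject the config if any returned list is non-empty.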
We can include validationMode in the transform function specification:
- LEGACY Mode (Default)
  - Purpose: Backward compatibility - allows all existing type conversions
  - Behavior: No validation, accepts everything (current Pinot behavior)
  - Use Case: Existing tables that shouldn't break
- LENIENT Mode (Recommended)
  - Purpose: Safe type conversions allowed
  - Behavior: Allows safe conversions like INT→LONG, FLOAT→DOUBLE, but blocks unsafe ones like STRING→INT
  - Use Case: New tables that want some safety but flexibility
- STRICT Mode
  - Purpose: Maximum type safety
  - Behavior: No automatic type conversions, exact type matching required
  - Use Case: Critical tables where type safety is paramount
```json
{
  "columnName": "processed_age",
  "transformFunction": "CAST(age_string AS INT)",
  "validationMode": "STRICT"
}
```
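The three modes can be read as a simple decision rule over a source type and a target type. The sketch below is one possible reading of the description above, not the PR's implementation: the class, enum, and method names are hypothetical, and the set of "safe" conversions contains only the two pairs the description names (INT→LONG, FLOAT→DOUBLE).

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of the mode semantics; names are illustrative only.
final class ValidationModeSketch {
  enum ValidationMode { LEGACY, LENIENT, STRICT }

  enum DataType { INT, LONG, FLOAT, DOUBLE, STRING }

  // Conversions treated as safe in LENIENT mode. Only the pairs named in the
  // description are listed; a real rule set would likely be larger (assumption).
  private static final Map<DataType, Set<DataType>> SAFE_WIDENINGS = new EnumMap<>(DataType.class);

  static {
    SAFE_WIDENINGS.put(DataType.INT, EnumSet.of(DataType.LONG));
    SAFE_WIDENINGS.put(DataType.FLOAT, EnumSet.of(DataType.DOUBLE));
  }

  static boolean isConversionAllowed(ValidationMode mode, DataType from, DataType to) {
    switch (mode) {
      case LEGACY:
        // No validation: every conversion is accepted.
        return true;
      case STRICT:
        // Exact type match required.
        return from == to;
      case LENIENT:
        // Identity or a listed safe widening.
        return from == to
            || SAFE_WIDENINGS.getOrDefault(from, EnumSet.noneOf(DataType.class)).contains(to);
      default:
        throw new IllegalArgumentException("Unknown mode: " + mode);
    }
  }

  public static void main(String[] args) {
    for (ValidationMode mode : ValidationMode.values()) {
      System.out.println(mode + " STRING->INT allowed: "
          + isConversionAllowed(mode, DataType.STRING, DataType.INT));
    }
  }
}
```

Under this reading, and assuming `age_string` is a STRING column, a STRING→INT conversion would pass only in LEGACY mode; whether an explicit `CAST` counts as an automatic conversion for this purpose is not stated in the description.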
Codecov Report
:x: Patch coverage is 52.63158% with 9 lines in your changes missing coverage. Please review.
:white_check_mark: Project coverage is 63.45%. Comparing base (1b6866d) to head (3b4410f).
:warning: Report is 237 commits behind head on master.
| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...operator/transform/function/TransformFunction.java | 0.00% | 7 Missing :warning: |
| ...ot/spi/config/table/ingestion/IngestionConfig.java | 33.33% | 2 Missing :warning: |
Additional details and impacted files
```diff
@@             Coverage Diff              @@
##             master   #17039      +/-   ##
============================================
+ Coverage     56.42%   63.45%   +7.03%
- Complexity      702     1419     +717
============================================
  Files          2406     3084     +678
  Lines        133681   182157   +48476
  Branches      21260    27953    +6693
============================================
+ Hits          75424   115596   +40172
- Misses        51983    57647    +5664
- Partials       6274     8914    +2640
```
| Flag | Coverage Δ | |
|---|---|---|
| custom-integration1 | 100.00% <ø> (?) | |
| integration | 100.00% <ø> (+100.00%) | :arrow_up: |
| integration1 | 100.00% <ø> (?) | |
| integration2 | 0.00% <ø> (ø) | |
| java-11 | 63.42% <52.63%> (+7.04%) | :arrow_up: |
| java-21 | 63.42% <52.63%> (+7.02%) | :arrow_up: |
| temurin | 63.45% <52.63%> (+7.03%) | :arrow_up: |
| unittests | 63.45% <52.63%> (+7.03%) | :arrow_up: |
| unittests1 | 56.30% <47.36%> (-0.13%) | :arrow_down: |
| unittests2 | 33.61% <52.63%> (?) | |
Flags with carried forward coverage won't be shown.