pinot icon indicating copy to clipboard operation
pinot copied to clipboard

Add IGNORE NULLS option to FIRST_VALUE and LAST_VALUE window functions

Open yashmayya opened this issue 1 year ago • 1 comments

  • The SQL standard defines a RESPECT NULLS or IGNORE NULLS option for the window functions LEAD, LAG, FIRST_VALUE, LAST_VALUE, and NTH_VALUE (although Pinot currently doesn't support this function). The default behavior is RESPECT NULLS.
  • This patch adds support for these options on the FIRST_VALUE and LAST_VALUE window functions (LEAD / LAG can be added in a subsequent patch). As the name suggests, the IGNORE NULLS option makes it so that the FIRST_VALUE and LAST_VALUE window functions compute the first and last non-null values respectively for each window frame.
  • If IGNORE NULLS is specified like LAST_VALUE(col1) IGNORE NULLS OVER (ORDER BY ts), it can effectively be used to gapfill data (see this article for example - https://learn.microsoft.com/en-us/azure/azure-sql-edge/imputing-missing-values).
  • Calcite has validation to ensure that the IGNORE NULLS / RESPECT NULLS operators are only used with window functions that they are applicable to as per standard SQL. This patch also updates the operators being registered in Pinot's operator table for LEAD / LAG since we don't currently support the null related options for those functions (this way, we fail during query planning rather than at runtime).
  • There are also some minor changes to the query plan serde here to hold the IGNORE NULLS option for a window function call.

yashmayya avatar Oct 21 '24 16:10 yashmayya

Codecov Report

Attention: Patch coverage is 88.94231% with 23 lines in your changes missing coverage. Please review.

Project coverage is 63.82%. Comparing base (59551e4) to head (4867f2c). Report is 1256 commits behind head on master.

Files with missing lines Patch % Lines
...operator/window/value/LastValueWindowFunction.java 89.15% 3 Missing and 6 partials :warning:
...perator/window/value/FirstValueWindowFunction.java 90.80% 2 Missing and 6 partials :warning:
...apache/pinot/common/collections/DualValueList.java 80.00% 1 Missing and 1 partial :warning:
...ache/pinot/calcite/sql/fun/PinotOperatorTable.java 75.00% 2 Missing :warning:
.../query/planner/logical/PlanNodeToRelConverter.java 0.00% 1 Missing :warning:
...ime/operator/window/value/ValueWindowFunction.java 80.00% 0 Missing and 1 partial :warning:
Additional details and impacted files
@@             Coverage Diff              @@
##             master   #14264      +/-   ##
============================================
+ Coverage     61.75%   63.82%   +2.07%     
- Complexity      207     1556    +1349     
============================================
  Files          2436     2660     +224     
  Lines        133233   145674   +12441     
  Branches      20636    22287    +1651     
============================================
+ Hits          82274    92981   +10707     
- Misses        44911    45822     +911     
- Partials       6048     6871     +823     
Flag Coverage Δ
custom-integration1 100.00% <ø> (+99.99%) :arrow_up:
integration 100.00% <ø> (+99.99%) :arrow_up:
integration1 100.00% <ø> (+99.99%) :arrow_up:
integration2 0.00% <ø> (ø)
java-11 63.80% <88.94%> (+2.09%) :arrow_up:
java-21 63.65% <88.94%> (+2.02%) :arrow_up:
skip-bytebuffers-false 63.82% <88.94%> (+2.07%) :arrow_up:
skip-bytebuffers-true 63.63% <88.94%> (+35.90%) :arrow_up:
temurin 63.82% <88.94%> (+2.07%) :arrow_up:
unittests 63.82% <88.94%> (+2.07%) :arrow_up:
unittests1 55.44% <88.94%> (+8.55%) :arrow_up:
unittests2 34.25% <2.88%> (+6.52%) :arrow_up:

Flags with carried forward coverage won't be shown. Click here to find out more.

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

codecov-commenter avatar Oct 21 '24 17:10 codecov-commenter