[CALCITE-5314] Prune empty parts of a query by exploiting stats/metadata

Open HanumathRao opened this issue 2 years ago • 0 comments

The following changes implement an optimization that prunes sub-trees of a query plan when a base table is empty. A table is detected as empty using the MaxRow stat.

I did not use existing stats such as rowcount to trigger the optimization, because they are estimates and can be stale. I also considered introducing a new stat such as exactRowCount, but that seems like overkill: it would be exact only for the base table and would become an estimate again for any node above it in the tree. After due consideration, the MaxRow stat seems to fit this scenario well.

  1. It is not a default stat; it must be supplied for a table by overriding the MaxRowHandler.
  2. The default value of MaxRow is very large (Infinity), so the optimization does not trigger by default.
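
To make the trigger condition concrete, here is a minimal sketch in plain Java. It is a toy model and not Calcite's metadata API: the class `MaxRowSketch` and its methods are hypothetical names used only for illustration. It shows the two properties listed above: a bound exists only when someone supplies it, and the fallback of infinity never reports a table as empty.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model (not Calcite's API): a per-table upper bound on row count. */
public class MaxRowSketch {
    // Only tables for which a bound was explicitly supplied appear here.
    private final Map<String, Double> suppliedMaxRows = new HashMap<>();

    /** Registers a guaranteed upper bound for a table (hypothetical helper). */
    public void supplyMaxRows(String table, double maxRows) {
        suppliedMaxRows.put(table, maxRows);
    }

    /** Returns the supplied bound, or infinity when nothing was supplied. */
    public double maxRowCount(String table) {
        return suppliedMaxRows.getOrDefault(table, Double.POSITIVE_INFINITY);
    }

    /** The pruning trigger: only a bound of exactly zero proves emptiness. */
    public boolean isProvablyEmpty(String table) {
        return maxRowCount(table) == 0d;
    }
}
```

Because the value is an upper bound rather than an estimate, a zero here is a guarantee, which is what makes it safe to rewrite the plan.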

The EmptyTableOptimizationConfig change adds a new rule that transforms a base table into an empty Values node when maxRowCount is zero. Once the empty Values node is created, the existing PruneEmptyRules perform the remaining optimizations needed to prune the complex sub-tree.
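
The two-step effect described here (replace the scan with an empty node, then let emptiness propagate upward) can be sketched with a small self-contained plan tree. This is an illustration only, under stated assumptions: `EmptyPruneSketch`, `Node`, and the string node kinds are hypothetical and are not Calcite classes, and the propagation rules cover only filter, inner join, and UNION ALL rather than everything PruneEmptyRules handles.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;

/** Toy plan tree (not Calcite's RelNode) showing empty-input pruning. */
public class EmptyPruneSketch {
    /** kind is one of "SCAN", "FILTER", "JOIN" (inner), "UNION_ALL", "EMPTY". */
    public static final class Node {
        public final String kind;
        public final String table; // set only for SCAN nodes
        public final List<Node> inputs;

        public Node(String kind, String table, Node... inputs) {
            this.kind = kind;
            this.table = table;
            this.inputs = Arrays.asList(inputs);
        }
    }

    /** Stand-in for an empty Values node. */
    public static final Node EMPTY = new Node("EMPTY", null);

    /** Rewrites the tree bottom-up; zeroMaxRowTables holds tables whose max row count is 0. */
    public static Node prune(Node node, Set<String> zeroMaxRowTables) {
        if (node == EMPTY || node.kind.equals("EMPTY")) {
            return EMPTY;
        }
        if (node.kind.equals("SCAN")) {
            // Step 1: a scan of a provably empty table becomes the empty node.
            return zeroMaxRowTables.contains(node.table) ? EMPTY : node;
        }
        List<Node> all = new ArrayList<>();      // every rewritten input
        List<Node> nonEmpty = new ArrayList<>(); // rewritten inputs that survived
        for (Node input : node.inputs) {
            Node pruned = prune(input, zeroMaxRowTables);
            all.add(pruned);
            if (pruned != EMPTY) {
                nonEmpty.add(pruned);
            }
        }
        boolean anyEmpty = nonEmpty.size() < all.size();
        // Step 2: propagate emptiness through the parent operator.
        switch (node.kind) {
            case "FILTER":
            case "JOIN":
                // A filter or inner join over an empty input produces no rows.
                if (anyEmpty) {
                    return EMPTY;
                }
                break;
            case "UNION_ALL":
                // Empty branches contribute nothing and can be dropped.
                if (nonEmpty.isEmpty()) {
                    return EMPTY;
                }
                if (nonEmpty.size() == 1) {
                    return nonEmpty.get(0);
                }
                all = nonEmpty;
                break;
            default:
                break; // unknown operators keep all of their inputs
        }
        return new Node(node.kind, node.table, all.toArray(new Node[0]));
    }
}
```

For example, pruning an inner join of a populated table with a table whose bound is zero collapses the whole join to `EMPTY`, while a UNION ALL with one empty branch collapses to its remaining branch.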

Please review the changes.

HanumathRao, Oct 10 '22 03:10