calcite-kudu
Use `KuduTableStatistics` to determine row counts
Summary: This change bundles several related changes:
- Remove `KuduLimitRel`
  - With #18, `KuduSortRel` now gets the fetch and offset. It is no longer derived from `EnumerableLimit`, and `EnumerableLimit` is just as efficient as `KuduLimit`.
- `KuduSortRel` and `KuduProjectRel` no longer produce unstable row estimates. Prior to this change, they produced `Double.MIN_VALUE`, which resulted in `Exception`s being thrown during the planning process.
- `TableType` now has a method for its estimated row counts.
- `CalciteKuduTable` now attempts to get `row counts` directly from the Kudu cluster and, if that fails, uses the estimates from `TableType`.

This results in a row count estimation that no longer depends on `TableType` and can be applied more generally.
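The fallback described above (ask the cluster first, use the static estimate if that fails) can be sketched in plain Java. This is only an illustration under stated assumptions: `RowCountFallback`, `RowCountEstimate`, and `rowCount` are hypothetical names, the cluster lookup is modelled as a `Supplier<OptionalLong>`, and none of it is taken from the actual `CalciteKuduTable` or `TableType` code.

```java
import java.util.OptionalLong;
import java.util.function.Supplier;

/** Hypothetical sketch of the row-count fallback; not the project's actual code. */
public class RowCountFallback {

    /** Stand-in for a table-type level estimate (the name is an assumption). */
    interface RowCountEstimate {
        long estimatedRowCount();
    }

    /**
     * Returns the row count reported by the cluster when one is available,
     * otherwise the static estimate.
     */
    static double rowCount(Supplier<OptionalLong> clusterStatistics,
                           RowCountEstimate fallback) {
        try {
            OptionalLong live = clusterStatistics.get();
            if (live.isPresent() && live.getAsLong() >= 0) {
                return (double) live.getAsLong();
            }
        } catch (RuntimeException e) {
            // The statistics lookup failed; fall through to the estimate.
        }
        return (double) fallback.estimatedRowCount();
    }

    public static void main(String[] args) {
        RowCountEstimate estimate = () -> 1_000_000L;
        // Cluster statistics available: the live count is used.
        System.out.println(rowCount(() -> OptionalLong.of(42L), estimate));
        // Cluster lookup throws: the static estimate is used instead.
        System.out.println(rowCount(() -> {
            throw new IllegalStateException("cluster unreachable");
        }, estimate));
    }
}
```

Keeping the lookup behind a supplier means a planner test can exercise both paths without a running cluster.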
Contributing to Twilio
All third-party contributors acknowledge that any contributions they provide will be made under the same open-source license that the open-source project is provided under.
- [X] I acknowledge that all my contributions will be made under the project's license.
I still think I want to take a pass at updating all the `computeSelfCost` implementations to better represent what they are doing.
- Projections would lower the number of rows in proportion to the number of columns selected
- Sort would call `super`, then set the CPU cost to 0
- Filter would a.) set the row count proportional to the number of partitions being scanned and b.) reduce the cost by some constant amount per additional filter
- Nested Join should adopt the costing provided by `EnumerableBatchNestedJoin` and stop fixating on the `ActorDimension` table.
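As a rough illustration of the first three heuristics, here is a self-contained Java sketch. Everything in it is hypothetical: `CostSketch`, `Cost`, and the method names are stand-ins rather than Calcite or calcite-kudu APIs, and applying the per-filter discount to the CPU component is an assumption, since the comment does not say which part of the cost should shrink.

```java
/** Hypothetical cost heuristics; none of these names come from Calcite or calcite-kudu. */
public class CostSketch {

    /** Minimal stand-in for a planner cost: rows, CPU and I/O. */
    static final class Cost {
        final double rows;
        final double cpu;
        final double io;

        Cost(double rows, double cpu, double io) {
            this.rows = rows;
            this.cpu = cpu;
            this.io = io;
        }
    }

    /** Projection: scale the row count by the fraction of columns kept. */
    static Cost project(Cost input, int selectedColumns, int totalColumns) {
        double fraction = (double) selectedColumns / totalColumns;
        return new Cost(input.rows * fraction, input.cpu, input.io);
    }

    /** Sort: keep the inherited cost but zero out the CPU component. */
    static Cost sort(Cost inherited) {
        return new Cost(inherited.rows, 0.0, inherited.io);
    }

    /**
     * Filter: rows scale with the share of partitions scanned, and every
     * predicate beyond the first removes a constant amount of CPU cost
     * (assumed component), floored at zero.
     */
    static Cost filter(Cost input, int partitionsScanned, int totalPartitions,
                       int predicateCount, double discountPerPredicate) {
        double rows = input.rows * partitionsScanned / totalPartitions;
        int additional = Math.max(0, predicateCount - 1);
        double cpu = Math.max(0.0, input.cpu - additional * discountPerPredicate);
        return new Cost(rows, cpu, input.io);
    }

    public static void main(String[] args) {
        Cost scan = new Cost(1000.0, 50.0, 10.0);
        Cost filtered = filter(scan, 1, 4, 3, 5.0);
        Cost projected = project(filtered, 2, 4);
        Cost sorted = sort(projected);
        System.out.println(sorted.rows + " rows, cpu " + sorted.cpu + ", io " + sorted.io);
    }
}
```

In a real rule set these adjustments would live in each rel's `computeSelfCost` override; the sketch only shows the arithmetic.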