Index Scan + Index Join Limit
Description
Limit clauses are currently not propagated to the `IndexScanPlanNode` or the `IndexJoinPlanNode`, so the execution engine cannot stop an index scan early once the limit is reached. Instead, the limit is applied after the fact by a `LimitPlanNode`, once the index scan has completed.
This PR adds functionality for pushing the limit value down to index scans; the pushdown is used in TPC-C. Limit values are pushed down to their child `LogicalGet` via a transformation rule and converted to values in the `PhysicalIndexScan`, which are then set in the `IndexScanPlanNode`. The PR also moves the `OrderByOrderingType` from the optimizer to the catalog, as a precursor to further changes that involve the sort direction of columns when creating or scanning an index.
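To illustrate why the pushdown matters, here is a minimal sketch that compares a limit applied after a full index scan with a limit applied inside the scan. It does not use the NoisePage classes: `Index`, `IndexScan`, `LimitAfterScan`, and `MakeIndex` are hypothetical stand-ins, and treating a limit of 0 as "no limit" is an assumption of the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical stand-in for an ordered index: key -> tuple id.
using Index = std::map<int, uint32_t>;

// Scans keys in [lo, hi]. A limit of 0 means "no limit" (assumption of this sketch).
// Returns the matching tuple ids and reports how many index entries were visited.
std::vector<uint32_t> IndexScan(const Index &index, int lo, int hi, std::size_t limit,
                                std::size_t *visited) {
  assert(visited != nullptr);
  std::vector<uint32_t> out;
  *visited = 0;
  for (auto it = index.lower_bound(lo); it != index.end() && it->first <= hi; ++it) {
    ++(*visited);
    out.push_back(it->second);
    // With the limit pushed down, the scan stops as soon as enough rows are found.
    if (limit != 0 && out.size() == limit) break;
  }
  return out;
}

// Applies the limit after the scan has finished, as a separate limit operator would.
std::vector<uint32_t> LimitAfterScan(std::vector<uint32_t> rows, std::size_t limit) {
  if (rows.size() > limit) rows.resize(limit);
  return rows;
}

// Builds a toy index with keys 0..n-1, each mapping to tuple id key * 10.
Index MakeIndex(int n) {
  Index index;
  for (int key = 0; key < n; ++key) index[key] = static_cast<uint32_t>(key) * 10;
  return index;
}
```

Both paths return the same rows, but the pushed-down variant visits only as many index entries as the limit requires, while the other visits every entry in the key range before discarding most of them.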
The final piece of the implementation introduces optional properties, so that ORDER BY sort properties can be pushed down into index scans whenever possible. There are a few possible implementations:
- Converting properties to include a boolean flag indicating whether they are optional.
- Converting property sets to include a boolean flag for each property in the set, indicating whether that property is optional.

We choose the first option in this implementation, though switching to the second would not be difficult (see PR #1031).
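As a rough illustration of the first alternative (a boolean flag on each property), the sketch below shows one plausible way an optional sort property could be checked. The types and functions here (`SortProperty`, `Provides`, `Satisfies`) are hypothetical simplifications, not the optimizer's actual property classes, and the semantics (an optional property never disqualifies a candidate) are an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified property: a sort on a single column plus an optional flag.
struct SortProperty {
  std::string column;
  bool ascending;
  bool optional;  // true: satisfy it if possible; false: it must be satisfied
};

// Returns true if the provided sort order covers the required column and direction.
bool Provides(const std::vector<SortProperty> &provided, const SortProperty &required) {
  return std::any_of(provided.begin(), provided.end(), [&](const SortProperty &p) {
    return p.column == required.column && p.ascending == required.ascending;
  });
}

// A candidate is acceptable when every non-optional required property is provided.
// Optional properties that are not provided are simply skipped.
bool Satisfies(const std::vector<SortProperty> &provided,
               const std::vector<SortProperty> &required) {
  return std::all_of(required.begin(), required.end(), [&](const SortProperty &r) {
    return r.optional || Provides(provided, r);
  });
}
```

Under these assumptions, an index scan that cannot deliver the requested order is still a valid candidate when the sort is optional, whereas a mandatory sort would have to be enforced by a separate operator.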
Additionally, a guide to optimizer development is included for future developers.
Further work
A description of further work is included in Issue #1421.
Hey @thepinetree, I'm just going to consolidate everything we discussed on Slack here so that you have a single place to go to when you return to this PR. Let me know if I'm missing anything. I'm not sure which of these belong in this PR and which should be done in later PRs; I guess that's up to you/Will/whoever makes those decisions.
- We need some way of pushing down limits to `OrderBy` nodes that are generated through the `PropertyEnforcer`.
- We need to remove the sort information from `Limit` nodes, and remove the `OrderBy` generation for limits in the `PlanGenerator`.
- Since the `InputColumnDeriver` uses an `unordered_set` to store columns, the order of the columns returned is essentially "random". This can lead to unnecessary `ProjectionPlanNode`s being generated in the middle of the query plan to reorder columns.
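For the last point, one possible direction (an assumption on my part, not something this PR implements) is a container that deduplicates like a set but iterates in insertion order. `OrderedColumnSet` and `NeedsReorder` below are hypothetical names used only for this sketch.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical container: deduplicates like a set but remembers insertion order.
class OrderedColumnSet {
 public:
  // Adds a column name; returns false if it was already present.
  bool Insert(const std::string &col) {
    if (!seen_.insert(col).second) return false;
    order_.push_back(col);
    return true;
  }

  // Columns in the order they were first inserted.
  const std::vector<std::string> &Columns() const { return order_; }

 private:
  std::unordered_set<std::string> seen_;  // used only for membership tests
  std::vector<std::string> order_;        // gives a deterministic iteration order
};

// A reordering projection is only needed when the derived order differs from the expected one.
bool NeedsReorder(const std::vector<std::string> &derived,
                  const std::vector<std::string> &expected) {
  return derived != expected;
}
```

Because the iteration order no longer depends on hashing, the derived column order is stable across runs and platforms, so a reordering projection would only appear when the orders genuinely differ.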