nebula
Refactoring sequential statements execution framework
At present, nebula processes a sequence of statements as a single execution plan, so the optimizer has to account for control flow logic while optimizing that plan. This adds complexity to either the optimizer or the planner.
We should refactor the implementation of sequential statements and separate control flow and data flow into different modules, in order to support stored procedures later. For example:
$var1 = GO FROM "Tim Duncan" OVER like YIELD like._dst AS dst;
$var2 = GO FROM "Tony Parker" OVER like YIELD like._dst AS dst;
GO FROM $var1.dst, $var2.dst OVER like YIELD $$.player.name;
For the query above, we should run a two-stage optimization over the statements separated by semicolons:
- First, analyze the control flow and apply parallelism optimizations, such as:
  Before:
    statement1 -> statement2 -> statement3
  =>
  After:
    statement1
              \
               ---> statement3
              /
    statement2
- Second, apply query-level optimizations, the same as in the current optimizer implementation.
Finally, we will combine the control flow plan and the query-level plan into one plan and pass it to the scheduler and execution engine to run:
  GetNbrs -> Project
                    \
                     ----> GetNbrs -> Project
                    /
  GetNbrs -> Project
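The control flow analysis described above could be sketched roughly as follows. This is only an illustration, not nebula's implementation: `build_dependencies` and `schedule_stages` are hypothetical names, the regular expressions are a simplification that only recognizes `$name = ...` assignments and `$name.` references, and the sketch assumes variable references are the only ordering constraint between statements (statements with side effects would need extra edges).

```python
import re
from typing import Dict, List, Set

# Hypothetical, simplified patterns: "$name = ..." defines a variable,
# "$name." references one. "$$." (destination vertex) does not match.
ASSIGN_RE = re.compile(r"^\s*\$(\w+)\s*=")
USE_RE = re.compile(r"\$(\w+)\.")


def build_dependencies(statements: List[str]) -> Dict[int, Set[int]]:
    """Map each statement index to the indices of statements it reads from."""
    defined_by: Dict[str, int] = {}
    deps: Dict[int, Set[int]] = {}
    for idx, stmt in enumerate(statements):
        match = ASSIGN_RE.match(stmt)
        # Only look for references in the part after the assignment target.
        body = stmt[match.end():] if match else stmt
        deps[idx] = {
            defined_by[name] for name in USE_RE.findall(body) if name in defined_by
        }
        if match:
            defined_by[match.group(1)] = idx
    return deps


def schedule_stages(deps: Dict[int, Set[int]]) -> List[List[int]]:
    """Group statements into stages; statements in one stage can run in parallel."""
    remaining = dict(deps)
    done: Set[int] = set()
    stages: List[List[int]] = []
    while remaining:
        # A statement is ready once everything it depends on has finished.
        ready = sorted(i for i, d in remaining.items() if d <= done)
        if not ready:
            raise ValueError("cyclic dependency between statements")
        stages.append(ready)
        done.update(ready)
        for i in ready:
            del remaining[i]
    return stages


statements = [
    '$var1 = GO FROM "Tim Duncan" OVER like YIELD like._dst AS dst',
    '$var2 = GO FROM "Tony Parker" OVER like YIELD like._dst AS dst',
    'GO FROM $var1.dst, $var2.dst OVER like YIELD $$.player.name',
]
deps = build_dependencies(statements)
stages = schedule_stages(deps)
print(stages)  # [[0, 1], [2]]
```

For the example query, the first two statements land in the same stage because neither reads the other's variable, and the third waits for both. Each statement would then still go through the existing query-level optimizer on its own.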
In the future, we may need to support other control syntax such as `for` or `if`; these would only affect the control flow optimization module.