
Refactoring sequential statements execution framework

Open yixinglu opened this issue 3 years ago • 0 comments

At present, nebula processes a sequence of statements as a single execution plan, so the optimizer has to take control-flow logic into account when optimizing that plan. This introduces extra complexity in either the optimizer or the planner.

We should refactor the implementation of sequential statements and separate control flow and data flow into different modules, so that stored procedures can be supported later. For example:

$var1 = GO FROM "Tim Duncan" OVER like YIELD like._dst AS dst;
$var2 = GO FROM "Tony Parker" OVER like YIELD like._dst AS dst;
GO FROM $var1.dst, $var2.dst OVER like YIELD $$.player.name;

For the above query, we should run a two-stage optimization over the three statements separated by semicolons:

  1. First, analyze the control flow and apply parallelism optimizations, such as:
Before:
  statement1 -> statement2 -> statement3

  => 

After:
  statement1
          \
            ---> statement3 
          /
  statement2

  2. Second, apply query-level optimizations to each statement, the same as the current optimizer implementation does.
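
The first, control-flow stage can be sketched as a dependency analysis over user variables. This is only an illustration in Python, not nebula's implementation: the names ASSIGN_RE, VAR_RE, analyze and stages are hypothetical, the statements are assumed to be already split on semicolons, and only data dependencies through $var references are considered (statements with side effects would need extra ordering rules).

```python
import re
from typing import Dict, List, Set

# Hypothetical helpers for illustration; none of these names exist in nebula.
ASSIGN_RE = re.compile(r"^\s*\$(\w+)\s*=")  # "$var1 = ..." defines var1
VAR_RE = re.compile(r"\$(\w+)")             # "$var1.dst" references var1; "$$" and "$-" do not match

QUERY = [
    '$var1 = GO FROM "Tim Duncan" OVER like YIELD like._dst AS dst',
    '$var2 = GO FROM "Tony Parker" OVER like YIELD like._dst AS dst',
    'GO FROM $var1.dst, $var2.dst OVER like YIELD $$.player.name',
]


def analyze(statements: List[str]) -> Dict[int, Set[int]]:
    """Map each statement index to the indices of the statements it reads from."""
    defined_by: Dict[str, int] = {}
    deps: Dict[int, Set[int]] = {}
    for i, stmt in enumerate(statements):
        m = ASSIGN_RE.match(stmt)
        body = stmt[m.end():] if m else stmt  # skip the assignment target itself
        deps[i] = {defined_by[v] for v in VAR_RE.findall(body) if v in defined_by}
        if m:
            defined_by[m.group(1)] = i
    return deps


def stages(deps: Dict[int, Set[int]]) -> List[List[int]]:
    """Group statements into levels; statements in one level may run in parallel."""
    level: Dict[int, int] = {}
    for i in sorted(deps):  # a dependency always points to an earlier statement
        level[i] = 1 + max((level[d] for d in deps[i]), default=-1)
    out: List[List[int]] = [[] for _ in range(max(level.values(), default=-1) + 1)]
    for i in sorted(level):
        out[level[i]].append(i)
    return out
```

For QUERY, analyze reports that the third statement depends on the first two, and stages(analyze(QUERY)) yields [[0, 1], [2]], which matches the "After" diagram: statement1 and statement2 can run in parallel before statement3.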

Finally, we combine the control-flow plan and the query-level plans into a single plan and pass it to the scheduler and execution engine to run:

GetNbrs -> Project
               \
                ----> GetNbrs -> Project
               /
GetNbrs -> Project
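
One possible way to perform this merge is sketched below in Python. It is an assumption-laden illustration rather than the actual planner: combine is a hypothetical function, each statement's optimized plan is assumed to be a linear chain of operator names, and the control-flow result is assumed to be a map from a statement index to the statements it depends on.

```python
from typing import Dict, List, Set, Tuple

# A plan node is (statement index, position in that statement's chain, operator name).
Node = Tuple[int, int, str]


def combine(deps: Dict[int, Set[int]],
            plans: Dict[int, List[str]]) -> List[Tuple[Node, Node]]:
    """Merge per-statement operator chains into one DAG, returned as an edge list."""
    edges: List[Tuple[Node, Node]] = []
    for i, ops in plans.items():
        nodes = [(i, k, op) for k, op in enumerate(ops)]
        # Data-flow edges inside a single statement, e.g. GetNbrs -> Project.
        edges.extend(zip(nodes, nodes[1:]))
        # Control-flow edges: the last operator of each dependency feeds
        # the first operator of the dependent statement.
        for d in sorted(deps[i]):
            tail = (d, len(plans[d]) - 1, plans[d][-1])
            edges.append((tail, nodes[0]))
    return edges


# Hypothetical inputs that mirror the example query above.
DEPS = {0: set(), 1: set(), 2: {0, 1}}
PLANS = {i: ["GetNbrs", "Project"] for i in range(3)}
EDGES = combine(DEPS, PLANS)
```

With these inputs the merged plan has five edges: one GetNbrs -> Project edge per statement, plus an edge from each of the first two Project nodes into the GetNbrs of the third statement, which is the shape drawn in the diagram.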

In the future, we may need to support other control-flow syntax such as for or if. Such constructs will only affect the control-flow optimization module.

yixinglu, Oct 14 '21 07:10