[ST-Engine][Design] The Design of LogicalPlan to PhysicalPlan

Open Hisoka-X opened this issue 2 years ago • 2 comments

Search before asking

  • [X] I had searched in the feature and found no similar feature requirement.

Description

The SeaTunnel engine receives the logical plan sent by the client and needs to convert it into a physical plan that can be executed directly. The conversion proceeds as follows:

  1. Logical Plan (image): After receiving the logical plan, we need to remove redundant Actions and verify the Schema (the Schemas of Transform2 and Transform5 should be the same).
  2. Execution Plan (image): While converting to an execution plan:
     • Transforms need to be merged; the criterion for merging is whether the data is split after the Transform.
     • Shuffle Actions are converted to Queues.
     • Enumerators and Committers are added.
  3. Pipeline (image): The Pipeline is currently the same as the ExecutionPlan because the Cache module has not been added yet; the two will differ later.
  4. Physical Plan (image): We split the Pipeline into separate executable tasks according to the degree of parallelism. After this, the tasks can be sent to the task execution service and run normally.
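
The stages above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the SeaTunnel engine's code: `Action`, `merge_transforms`, `shuffle_to_queue`, and `to_tasks` are hypothetical names, and the plan is modelled as a plain dict plus an edge list. The sketch reads the merge criterion as "fuse a Transform with the next Transform only when its output feeds exactly one downstream Action and that Action has no other input", and it assumes the fused Transforms share the upstream parallelism. The Enumerator and Committer step is omitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    name: str
    kind: str  # "source", "transform", "shuffle", "queue" or "sink"
    parallelism: int = 1


def merge_transforms(actions, edges):
    """Fuse transform -> transform edges whose upstream output is not split."""
    actions, edges = dict(actions), list(edges)
    merged_something = True
    while merged_something:
        merged_something = False
        for up, down in edges:
            fan_out = sum(1 for a, _ in edges if a == up)
            fan_in = sum(1 for _, b in edges if b == down)
            both_transforms = (actions[up].kind == "transform"
                               and actions[down].kind == "transform")
            if both_transforms and fan_out == 1 and fan_in == 1:
                # Assumption: the fused action keeps the upstream parallelism.
                fused = Action(f"{up}+{down}", "transform", actions[up].parallelism)
                del actions[up], actions[down]
                actions[fused.name] = fused
                rename = {up: fused.name, down: fused.name}
                edges = [(rename.get(a, a), rename.get(b, b))
                         for a, b in edges if (a, b) != (up, down)]
                merged_something = True
                break  # the edge list changed, so rescan from the start
    return actions, edges


def shuffle_to_queue(actions):
    """Replace every shuffle action with a queue of the same name."""
    return {name: Action(name, "queue", a.parallelism) if a.kind == "shuffle" else a
            for name, a in actions.items()}


def to_tasks(actions):
    """Expand each action into one task id per unit of parallelism."""
    return [f"{a.name}#{i}" for a in actions.values() for i in range(a.parallelism)]


# Hypothetical logical plan: source -> t1 -> t2 -> shuffle -> sink
logical_actions = {
    "source": Action("source", "source", 2),
    "t1": Action("t1", "transform", 2),
    "t2": Action("t2", "transform", 2),
    "shuffle": Action("shuffle", "shuffle", 1),
    "sink": Action("sink", "sink", 3),
}
logical_edges = [("source", "t1"), ("t1", "t2"), ("t2", "shuffle"), ("shuffle", "sink")]

merged_actions, merged_edges = merge_transforms(logical_actions, logical_edges)
execution_actions = shuffle_to_queue(merged_actions)
tasks = to_tasks(execution_actions)
```

Here `t1` and `t2` collapse into a single `t1+t2` action because nothing splits the data between them, the shuffle becomes a queue, and each remaining action is expanded into as many tasks as its parallelism.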

Usage Scenario

No response

Related issues

No response

Are you willing to submit a PR?

  • [x] Yes I am willing to submit a PR!

Code of Conduct

Hisoka-X, Jul 26 '22 04:07

Hi, excuse me, let me ask a question: why is the data queue designed to decouple source and sink? Is the data cached in the queue when the source fails?

2013650523, Aug 10 '22 07:08

Hi, excuse me, let me ask a question: why is the data queue designed to decouple source and sink? Is the data cached in the queue when the source fails?

  1. The Queue is created when a shuffle transform (now called partition transform) is used. It is used to shuffle data and to change parallelism.
  2. The cache feature you mentioned will be added later.
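
To illustrate the first point, here is a minimal sketch of how a queue per downstream task lets a shuffle repartition records and change parallelism. It is not SeaTunnel's implementation: `partition_index`, `shuffle_into_queues`, and `drain` are hypothetical helpers, Python's in-memory `queue.Queue` stands in for the engine's Queue, and hashing the key with CRC32 is an assumed partitioning rule.

```python
import zlib
from queue import Queue


def partition_index(key, num_partitions):
    """Map a key to a partition deterministically (assumed rule: CRC32 modulo)."""
    return zlib.crc32(str(key).encode("utf-8")) % num_partitions


def shuffle_into_queues(upstream_outputs, key_fn, downstream_parallelism):
    """Route records from any number of upstream tasks into one queue per downstream task."""
    queues = [Queue() for _ in range(downstream_parallelism)]
    for output in upstream_outputs:
        for record in output:
            index = partition_index(key_fn(record), downstream_parallelism)
            queues[index].put(record)
    return queues


def drain(q):
    """Read everything currently in a queue, as a downstream task would."""
    items = []
    while not q.empty():
        items.append(q.get())
    return items


# Two upstream tasks feed three downstream tasks (parallelism 2 -> 3).
upstream_outputs = [
    [{"user": "a", "v": 1}, {"user": "b", "v": 2}],
    [{"user": "a", "v": 3}, {"user": "c", "v": 4}],
]
queues = shuffle_into_queues(upstream_outputs, lambda r: r["user"], 3)
partitions = [drain(q) for q in queues]
```

Because the queue sits between the two sides, the number of upstream tasks and the number of downstream tasks no longer have to match, and all records with the same key end up in the same downstream partition.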

Hisoka-X, Aug 10 '22 07:08