seatunnel
[Umbrella] SeaTunnel Transform V2 Design
Code of Conduct
- [X] I agree to follow this project's Code of Conduct
Search before asking
- [X] I had searched in the issues and found no similar issues.
Describe the proposal
Background #2678
Currently, the transform code is bound to a single engine and cannot be shared with other engines.
I propose that we create a transform-v2 module to unify the transform implementation. Like source and sink, it is decoupled from the engine and can run on different engines.
Furthermore, we can use the translation module to integrate transforms with the SeaTunnel, Flink, and Spark engines for execution.
To keep SeaTunnel positioned as a data integration platform and avoid work beyond the plan, transform-v2 will support only UDF-level data conversion and will not support SQL transforms (because st-engine does not support SQL parsing and analysis).
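As a rough illustration of what "UDF-level" and "decoupled from the engine" could mean, here is a minimal sketch. The names `RowTransform` and `UpperCaseField` are hypothetical and are not the actual transform-v2 API; the row is modeled as a plain `Object[]` purely as an assumption for the example.

```java
import java.util.Arrays;
import java.util.Locale;

// Hypothetical names: RowTransform and UpperCaseField are illustrative only
// and are not taken from the SeaTunnel code base.
interface RowTransform {
    // Receives one input row and returns the converted row.
    // No Flink, Spark, or SeaTunnel engine types appear in the signature.
    Object[] map(Object[] row);
}

// A UDF-level conversion: upper-cases the String value at one field index.
class UpperCaseField implements RowTransform {
    private final int fieldIndex;

    UpperCaseField(int fieldIndex) {
        this.fieldIndex = fieldIndex;
    }

    @Override
    public Object[] map(Object[] row) {
        // Copy first so the input row is left untouched.
        Object[] out = Arrays.copyOf(row, row.length);
        Object value = out[fieldIndex];
        if (value instanceof String) {
            out[fieldIndex] = ((String) value).toUpperCase(Locale.ROOT);
        }
        return out;
    }
}
```

Because the interface only mentions rows, the same class could in principle be reused by any engine that can hand it one row at a time.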
Motivation
- Supports running on different engines
- Supports updating field data types, values, and order
- Supports deleting and adding fields
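The field-level operations listed above can be sketched on a single row as follows. This is only an illustration under assumptions: `FieldOps` is a hypothetical helper, and a row is modeled as a `List<Object>` rather than any SeaTunnel row type.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not SeaTunnel code. Every method returns a new row
// and leaves the input row unchanged.
final class FieldOps {
    private FieldOps() {}

    // Update a field's data type and value: parse a String field into an Integer.
    static List<Object> castToInt(List<Object> row, int index) {
        List<Object> out = new ArrayList<>(row);
        out.set(index, Integer.valueOf((String) out.get(index)));
        return out;
    }

    // Update field order: output position i takes the input field at order[i].
    static List<Object> reorder(List<Object> row, int... order) {
        List<Object> out = new ArrayList<>(order.length);
        for (int index : order) {
            out.add(row.get(index));
        }
        return out;
    }

    // Delete the field at the given position.
    static List<Object> delete(List<Object> row, int index) {
        List<Object> out = new ArrayList<>(row);
        out.remove(index); // int argument selects remove-by-position
        return out;
    }

    // Add a new field at the end of the row.
    static List<Object> add(List<Object> row, Object value) {
        List<Object> out = new ArrayList<>(row);
        out.add(value);
        return out;
    }
}
```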
Overall Design
The Transform base process contains:
- Transform implementation
- Transform translation layer
- Adapt to the Flink engine
- Adapt to the Spark engine
- Adapt to the SeaTunnel engine
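One way to picture the translation layer in the list above is a thin adapter per engine that wraps the engine-neutral transform in whatever function type that engine expects. The sketch below assumes this shape. `EngineNeutralTransform`, `EngineAMapFunction`, `EngineBMapFunction`, and `TransformTranslation` are all hypothetical stand-ins. The real Flink, Spark, and SeaTunnel engines define their own operator interfaces, which are not reproduced here.

```java
// Engine-neutral contract (hypothetical name).
interface EngineNeutralTransform {
    Object[] map(Object[] row);
}

// Stand-ins for two engines' native per-row operators. They differ only in
// method name, to show that each engine needs its own adapter.
interface EngineAMapFunction {
    Object[] map(Object[] value);
}

interface EngineBMapFunction {
    Object[] call(Object[] value);
}

// The "translation layer": one small adapter per engine, one shared transform.
final class TransformTranslation {
    private TransformTranslation() {}

    static EngineAMapFunction toEngineA(EngineNeutralTransform transform) {
        return transform::map;
    }

    static EngineBMapFunction toEngineB(EngineNeutralTransform transform) {
        return transform::map;
    }
}
```

With this layout the transform logic is written once, and supporting another engine means adding one more adapter method rather than another copy of the transform.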
Transform

Translation layer

Task list
Translation layer
- [x] #3145
- [ ] #3267
- [ ] #3268
Transform
- [ ] Substring transform
- [ ] Convert date & time & timestamp transform
Are you willing to submit PR?
- [ ] Yes I am willing to submit a PR!
Does the transform method use functions? Can it support SQL?
@yuangjiang
Transform operates directly on the stream<row> of each engine. SQL is currently not supported, but the same functionality can be achieved.
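For example, a simple filter-and-project query can be written as row-level functions. The sketch below is only an illustration: `SqlLikeTransform` is a hypothetical class, `java.util.stream` stands in for an engine's row stream, and the row layout `[name, age]` is an assumption.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical example, not SeaTunnel code.
final class SqlLikeTransform {
    private SqlLikeTransform() {}

    // Roughly equivalent in spirit to: SELECT name FROM rows WHERE age >= 18
    // Assumed row layout: [name (String), age (Integer)].
    static List<Object[]> adultNames(List<Object[]> rows) {
        return rows.stream()
                .filter(row -> (Integer) row[1] >= 18) // WHERE age >= 18
                .map(row -> new Object[] {row[0]})     // SELECT name
                .collect(Collectors.toList());
    }
}
```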
I suggest supporting sending dirty data to an extra sink.
Good idea. This is a separate feature: data partitioning (selected data rows are sent to a specified sink).
@hailin0 Can we describe the relationship between transforms? For example, transform1 and transform2 run in parallel, and transform3 uses the output of both transform1 and transform2 to do the filtering.
For reference, see https://seatunnel.apache.org/docs/concept/config#other
This issue has been automatically marked as stale because it has not had recent activity for 30 days. It will be closed in next 7 days if no further activity occurs.
This issue has been closed because it has not received response for too long time. You could reopen it if you encountered similar problems in the future.