Support text splitter transform (Chunking).
@Hisoka-X
Can I add a TextTransform class to the seatunnel-transforms-v2/src/main/java/org/apache/seatunnel/transforms/text package for this feature?
Can I add a TextTransform class to the seatunnel-transforms-v2/src/main/java/org/apache/seatunnel/transforms/text package for this feature?
That's right!
@Hisoka-X Additionally, is it acceptable to extend MultipleFieldOutputTransform, implement protected Column[] getOutputColumns() with only one "chunk" column, and receive the chunk_size option via the constructor for this text chunking transform?
Is it acceptable to extend MultipleFieldOutputTransform, implement protected Column[] getOutputColumns()
This is the normal and correct way to add a new transform.
only one "chunk" column, and receive the chunk_size option via the constructor
Could you use https://docs.dify.ai/en/guides/knowledge-base/create-knowledge-and-upload-documents/chunking-and-cleaning-text#general-mode as a guide? We should support general mode in our first version.
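For discussion, here is a rough sketch of how general-mode chunking could work, written as plain Java with no SeaTunnel dependencies. It assumes general mode boils down to three options: a delimiter, a maximum chunk length, and a chunk overlap. The class name GeneralModeChunker and its constructor parameters are hypothetical, and length is counted in characters here rather than tokens, which is a simplification.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of SeaTunnel: splits text on a delimiter,
// then cuts any segment longer than chunkSize into overlapping windows.
class GeneralModeChunker {
    private final String delimiter;
    private final int chunkSize; // maximum chunk length, in characters (assumption)
    private final int overlap;   // characters shared by consecutive windows

    GeneralModeChunker(String delimiter, int chunkSize, int overlap) {
        if (delimiter == null || delimiter.isEmpty()) {
            throw new IllegalArgumentException("delimiter must not be empty");
        }
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        if (overlap < 0 || overlap >= chunkSize) {
            throw new IllegalArgumentException("overlap must be in [0, chunkSize)");
        }
        this.delimiter = delimiter;
        this.chunkSize = chunkSize;
        this.overlap = overlap;
    }

    List<String> split(String text) {
        List<String> chunks = new ArrayList<>();
        if (text == null || text.isEmpty()) {
            return chunks;
        }
        // Each window starts this many characters after the previous one.
        int step = chunkSize - overlap;
        // Pattern.quote makes the delimiter a literal, not a regex.
        for (String segment : text.split(Pattern.quote(delimiter))) {
            String s = segment.trim();
            if (s.isEmpty()) {
                continue; // skip blank segments between repeated delimiters
            }
            for (int start = 0; start < s.length(); start += step) {
                int end = Math.min(start + chunkSize, s.length());
                chunks.add(s.substring(start, end));
                if (end == s.length()) {
                    break; // the last window reached the end of the segment
                }
            }
        }
        return chunks;
    }
}
```

With a delimiter of "\n\n", a chunk size of 5, and an overlap of 2, the input "abcdefgh\n\nxy" yields the chunks "abcde", "defgh", and "xy". Inside the transform, each returned string would become one value of the "chunk" column.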
@Hisoka-X Okay, thanks. Can I take this issue?
@iinow Is there any progress?
@davidzollo I’m afraid I won’t be able to take on the task as I initially planned. I’m really sorry for the inconvenience.
Don't worry, delays do sometimes happen. Can you complete it a few days later?