[Feature] Connector prepare for RAG
Search before asking
- [x] I have searched the existing feature requests and found no similar feature requirement.
Description
SeaTunnel is a multimodal data integration tool, so we hope it can support parsing complex file types, converting their contents into structured data streams, and ultimately writing them into a vector database through embedding. This issue tracks the related tasks.
For chunking, please refer to https://docs.dify.ai/en/guides/knowledge-base/create-knowledge-and-upload-documents/chunking-and-cleaning-text and https://docs.llamaindex.ai/en/stable/examples/node_parsers/semantic_chunking/
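To make the chunking step concrete for the abstraction discussion, here is a minimal sketch of fixed-size chunking with overlap, a common baseline strategy. It is only an illustration and not SeaTunnel's API: the function name `chunk_text` and the parameters `max_chars` and `overlap` are hypothetical, and a real connector would likely split on tokens or sentences rather than raw characters.

```python
def chunk_text(text: str, max_chars: int = 500, overlap: int = 50) -> list[str]:
    """Split text into chunks of at most max_chars characters.

    Consecutive chunks share `overlap` characters so that content cut at a
    boundary still appears whole in one of the neighbouring chunks.
    Hypothetical helper for illustration only.
    """
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    if not 0 <= overlap < max_chars:
        raise ValueError("overlap must satisfy 0 <= overlap < max_chars")

    step = max_chars - overlap  # how far the window advances each time
    chunks: list[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + max_chars])
        # Stop once the current window has reached the end of the text.
        if start + max_chars >= len(text):
            break
    return chunks
```

Each returned chunk would then be passed to an embedding model and written to the vector store together with its source metadata.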
Usage Scenario
No response
Related issues
No response
Are you willing to submit a PR?
- [ ] Yes I am willing to submit a PR!
Code of Conduct
- [x] I agree to follow this project's Code of Conduct
@Hisoka-X
Before working on this, I think it’s important to have a discussion about abstraction first.
Is it okay for the person who originally created this to just go ahead and handle the abstraction work as well?
@Hisoka-X
Is it alright if I collaborate with @joonseolee on the abstraction task and also take on tickets 1 through 3 together?
Sure! Thanks @joonseolee @iinow