llama-stack
Provide better control over the RAG ingestion stages (conversion, chunking, embedding, storing)
🚀 Describe the new functionality needed
As of now, RAG ingestion chunks documents with a trivial overlapping-chunk algorithm and converts only PDFs, using pypdf. The entire ingestion process should be made more general, flexible, and user-controllable by introducing configuration settings for each stage, similar to the way the embedding model can be specified via config today.
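To make the request concrete, here is a minimal sketch of what per-stage settings could look like. Every name in it (`IngestionConfig`, `overlapping_chunks`, `prepare_chunks`, the field names) is hypothetical and not part of the current llama-stack API; the chunker only approximates the overlapping-window behavior described above.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


def overlapping_chunks(text: str, size: int, overlap: int) -> List[str]:
    """Fixed-size character windows; each shares `overlap` chars with the previous one."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    if not text:
        return []
    step = size - overlap
    # Stop early enough that the last window is not fully contained in the previous one.
    stop = max(len(text) - overlap, 1)
    return [text[i:i + size] for i in range(0, stop, step)]


@dataclass
class IngestionConfig:
    """Hypothetical per-stage settings (conversion, chunking, embedding)."""
    # Conversion: one converter per MIME type, mapping raw bytes to text.
    converters: Dict[str, Callable[[bytes], str]] = field(default_factory=dict)
    # Chunking: granularity is user-controlled.
    chunk_size: int = 512
    chunk_overlap: int = 64
    # Embedding: None would mean "skip the embedding stage".
    embedding_model: Optional[str] = None


def prepare_chunks(raw: bytes, mime_type: str, cfg: IngestionConfig) -> List[str]:
    """Run the conversion and chunking stages as configured."""
    converter = cfg.converters.get(mime_type)
    if converter is None:
        raise ValueError(f"no converter configured for {mime_type}")
    return overlapping_chunks(converter(raw), cfg.chunk_size, cfg.chunk_overlap)


cfg = IngestionConfig(
    converters={"text/plain": lambda b: b.decode("utf-8")},
    chunk_size=4,
    chunk_overlap=2,
)
chunks = prepare_chunks(b"abcdefghij", "text/plain", cfg)
```

With these settings, `chunks` holds four windows of four characters each. A PDF converter would be registered the same way under `application/pdf`.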
💡 Why is this needed? What if we don't build it?
Finer control over the ingestion process is an enabler for a wide range of customer use cases. To mention a few examples:
- Structured chunking aware of the document format (e.g., .json, .md);
- Controllable chunking granularity;
- Defining document converters on a per-format basis;
- Support for sparse DBs by omitting the embedding stage.
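As an illustration of the first item, format-aware chunking could be sketched as a registry of chunkers keyed by file extension. The function names and the `CHUNKERS` registry are assumptions made for this example only, not an existing llama-stack interface.

```python
import json
import os
import re
from typing import Callable, Dict, List


def chunk_markdown(text: str) -> List[str]:
    """Split Markdown at ATX headings so each chunk holds one section."""
    parts = re.split(r"^(?=#{1,6} )", text, flags=re.MULTILINE)
    return [p.strip() for p in parts if p.strip()]


def chunk_json(text: str) -> List[str]:
    """Emit one chunk per top-level key (object) or per element (array)."""
    data = json.loads(text)
    if isinstance(data, dict):
        return [json.dumps({key: value}) for key, value in data.items()]
    if isinstance(data, list):
        return [json.dumps(item) for item in data]
    return [json.dumps(data)]


# Hypothetical registry: extension -> structure-aware chunker.
CHUNKERS: Dict[str, Callable[[str], List[str]]] = {
    ".md": chunk_markdown,
    ".json": chunk_json,
}


def chunk_document(filename: str, text: str,
                   fallback: Callable[[str], List[str]]) -> List[str]:
    """Pick a chunker by extension, falling back to a generic one."""
    ext = os.path.splitext(filename)[1].lower()
    return CHUNKERS.get(ext, fallback)(text)


md_chunks = chunk_document("notes.md", "# A\nfoo\n## B\nbar\n", lambda t: [t])
json_chunks = chunk_document("data.json", '{"a": 1, "b": [2, 3]}', lambda t: [t])
txt_chunks = chunk_document("plain.txt", "hello", lambda t: [t])
```

Unknown extensions go through `fallback`, which is where the existing overlapping-chunk strategy would plug in.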
Other thoughts
No response