llama-stack Provide better control over the RAG ingestion stages (conversion, chunking, embedding, storing)

Open ilya-kolchinsky opened this issue 8 months ago • 8 comments

🚀 Describe the new functionality needed

As of now, RAG ingestion chunks documents using a trivial overlapping-chunk algorithm and converts PDFs (and PDFs only) using pypdf. The entire ingestion process should be made more general, flexible, and user-controllable by introducing the respective configuration settings, similar to the way the embedding model can be specified today via config.
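
For readers unfamiliar with the approach, here is a minimal sketch of what an overlapping-chunk algorithm looks like. It is an illustration, not the llama-stack implementation: the function name `overlapping_chunks` and its parameters are hypothetical, and it splits on whitespace-separated words for simplicity, whereas a real implementation may operate on tokens.

```python
def overlapping_chunks(text: str, window: int, overlap: int) -> list[str]:
    """Split text into fixed-size word windows that share `overlap` words.

    Hypothetical helper for illustration only; it ignores document structure
    entirely, which is the limitation this issue describes.
    """
    if window <= 0 or not 0 <= overlap < window:
        raise ValueError("require window > 0 and 0 <= overlap < window")

    words = text.split()
    step = window - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + window]))
        if start + window >= len(words):
            break  # the last window already reached the end of the text
    return chunks


if __name__ == "__main__":
    for chunk in overlapping_chunks("a b c d e f", window=4, overlap=2):
        print(chunk)
```

Because the window boundaries depend only on position, a heading, a JSON object, or a table row can be cut in half, which is why format-aware chunking is requested below.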

💡 Why is this needed? What if we don't build it?

Having a higher degree of control over the ingestion process is an enabler for a wide range of customer use cases. To mention a few examples:

Structured chunking aware of the document format (e.g., .json, .md);
Controllable chunking granularity;
Defining document converters on a per format basis;
Support for sparse DBs by omitting the embedding stage.
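
To make the request more concrete, the examples above could be expressed through a configuration object along the following lines. This is only a rough sketch of one possible shape: `IngestionConfig`, `ingest`, and every helper below are hypothetical names invented for illustration, not existing llama-stack APIs, and the storing stage is left to the caller.

```python
from dataclasses import dataclass, field
from pathlib import PurePath
from typing import Callable, Optional

# Hypothetical stage signatures (assumptions, not llama-stack types).
Converter = Callable[[bytes], str]
Chunker = Callable[[str], list[str]]
Embedder = Callable[[str], list[float]]


def plain_text_converter(raw: bytes) -> str:
    """Fallback converter: treat the input as UTF-8 text."""
    return raw.decode("utf-8")


def fixed_size_chunker(size: int) -> Chunker:
    """Build a chunker with controllable granularity (words per chunk)."""
    def chunk(text: str) -> list[str]:
        words = text.split()
        return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]
    return chunk


def markdown_heading_chunker(text: str) -> list[str]:
    """Format-aware chunking: start a new chunk at every Markdown heading."""
    chunks, current = [], []
    for line in text.splitlines():
        if line.startswith("#") and current:
            chunks.append("\n".join(current))
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current))
    return chunks


@dataclass
class IngestionConfig:
    # Per-format overrides, keyed by lowercase file extension (e.g. ".md").
    converters: dict[str, Converter] = field(default_factory=dict)
    chunkers: dict[str, Chunker] = field(default_factory=dict)
    default_converter: Converter = plain_text_converter
    default_chunker: Chunker = fixed_size_chunker(200)
    # None skips the embedding stage, e.g. for a sparse (keyword) store.
    embedder: Optional[Embedder] = None


def ingest(filename: str, raw: bytes, config: IngestionConfig) -> list[dict]:
    """Run conversion, chunking, and optional embedding; return records to store."""
    suffix = PurePath(filename).suffix.lower()
    convert = config.converters.get(suffix, config.default_converter)
    chunk = config.chunkers.get(suffix, config.default_chunker)

    records = []
    for text in chunk(convert(raw)):
        record = {"source": filename, "text": text}
        if config.embedder is not None:
            record["embedding"] = config.embedder(text)
        records.append(record)
    return records


if __name__ == "__main__":
    config = IngestionConfig(chunkers={".md": markdown_heading_chunker})
    for record in ingest("notes.md", b"# A\nalpha\n# B\nbeta", config):
        print(record)
```

Keying converters and chunkers by file extension covers the per-format cases, the `size` argument covers granularity, and leaving `embedder` unset covers the sparse-DB case without a separate code path.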

Other thoughts

No response

Feb 12 '25 10:02 ilya-kolchinsky
