[FEAT]: Add Optional Small-to-Big Retrieval
What would you like to see?
Apparently, smaller chunk sizes improve retrieval quality, while larger chunk sizes improve generation quality; see "Advanced RAG 01: Small-to-Big Retrieval".
If the current embedding process stores each chunk's relative index within its document, then when chunk i is retrieved, we can prepend chunks [i-2, i-1] and append chunks [i+1, i+2] and pass the combined text on to the generation step. This would give both benefits: smaller chunks for retrieval and larger chunks for generation. Naturally, we need to make sure that each i+/-n chunk actually exists before adding it, so that we never add null.
My idea is to simplify the implementation by just adding optional prepend/append integers that would default to 0, but could be changed by the user in the settings.
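To make the idea concrete, here is a minimal sketch of the neighbor expansion with optional prepend/append counts. Everything in it is an assumption for illustration: the `ChunkStore` mapping, the `expand_chunk` name, and the `(doc_id, index)` key layout are hypothetical and do not reflect how chunks are actually stored in this project.

```python
from typing import Dict, List, Tuple

# Hypothetical store: (doc_id, chunk_index) -> chunk text.
# The real vector store layout is not known; this is only an assumption.
ChunkStore = Dict[Tuple[str, int], str]


def expand_chunk(
    store: ChunkStore,
    doc_id: str,
    index: int,
    prepend: int = 0,  # how many earlier chunks to add (default 0 = current behavior)
    append: int = 0,   # how many later chunks to add (default 0 = current behavior)
) -> str:
    """Join chunk `index` with its neighbors from the same document.

    Neighbors that do not exist (before the first or after the last
    chunk) are skipped, so nothing null is ever added.
    """
    if prepend < 0 or append < 0:
        raise ValueError("prepend and append must be non-negative")

    parts: List[str] = []
    for i in range(index - prepend, index + append + 1):
        text = store.get((doc_id, i))
        if text is not None:
            parts.append(text)
    return "\n".join(parts)


if __name__ == "__main__":
    # A five-chunk document; the retrieved chunk is the first one,
    # so the two "previous" chunks do not exist and are skipped.
    store = {("doc-a", i): f"chunk {i}" for i in range(5)}
    print(expand_chunk(store, "doc-a", 0, prepend=2, append=2))
```

With both values left at 0 the function returns only the retrieved chunk, so existing behavior would not change unless the user opts in. This sketch does not deduplicate overlapping windows when two retrieved chunks are close together; that would need separate handling.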
The alternative is to implement a full Parent Document Retriever, but that is a much bigger task IMHO.
Parent Document Retriever would be a nice per-document option (like the pinning option), because workspaces often mix documents, some of which are too big to be retrieved whole as the parent.