docq
CORE: ASK logic re-write to improve control and sophisticated RAG
Situation
We are currently using the higher-level Llama Index APIs for retrieval, querying, and synthesis.
These don't give us the customizability we need. For example, using fusion retrieval with chat, where chat history is included in the LLM message collection.
In the future we will likely need to support multiple RAG algorithms, for two reasons:
- iteration and improving performance over time.
- algorithms optimised for specific use cases, e.g. code vs. financial data vs. text-heavy documents vs. expert systems.
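To make supporting multiple algorithms concrete, one option is a small registry that maps a use case to a RAG strategy, so new algorithms can be added without touching the ASK entry point. This is a minimal sketch; the names (`register_strategy`, `run_ask`, the strategy keys) are hypothetical and not an existing Docq API.

```python
# Hypothetical registry of RAG strategies, keyed by use case.
# Each strategy is a callable that takes a query and returns an answer.
RAG_STRATEGIES = {}

def register_strategy(name):
    """Decorator that registers a RAG strategy under a use-case name."""
    def decorator(fn):
        RAG_STRATEGIES[name] = fn
        return fn
    return decorator

@register_strategy("default")
def basic_rag(query: str) -> str:
    # Placeholder for the current retrieval + synth path.
    return f"basic({query})"

@register_strategy("code")
def code_rag(query: str) -> str:
    # Placeholder for a code-optimised algorithm.
    return f"code-optimised({query})"

def run_ask(query: str, use_case: str = "default") -> str:
    """ASK entry point: dispatch to the strategy for this use case."""
    return RAG_STRATEGIES[use_case](query)
```

With this shape, iterating on an algorithm means swapping the registered callable, and per-use-case tuning means registering a new key.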
Solution
Move to using Llama Index pipelines.
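Whatever pipeline framework we land on, the composition we need looks roughly like the sketch below: fuse results from several retrievers, then build the LLM message collection with chat history injected explicitly (the step the higher-level APIs don't let us control today). This is a plain-Python mock to illustrate the shape, not Llama Index pipeline code; `fuse` and `build_messages` are illustrative names.

```python
def fuse(result_lists, top_k=3):
    """Reciprocal-rank fusion over several retrievers' ranked result lists."""
    scores = {}
    for results in result_lists:
        for rank, doc in enumerate(results):
            # Standard RRF scoring with the conventional k=60 constant.
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (60 + rank)
    ranked = sorted(scores.items(), key=lambda kv: -kv[1])
    return [doc for doc, _ in ranked][:top_k]

def build_messages(question, context_docs, chat_history):
    """Assemble the LLM message collection, with chat history placed
    between the context-bearing system message and the new question."""
    system = "Answer using the context:\n" + "\n".join(context_docs)
    return [
        {"role": "system", "content": system},
        *chat_history,
        {"role": "user", "content": question},
    ]

# Example wiring: two retrievers' outputs fused, then prompted with history.
retriever_a = ["doc1", "doc2"]
retriever_b = ["doc2", "doc3"]
fused = fuse([retriever_a, retriever_b])
messages = build_messages(
    "What changed?", fused,
    chat_history=[{"role": "user", "content": "hi"},
                  {"role": "assistant", "content": "hello"}],
)
```

A pipeline framework earns its keep if each of these steps can be a swappable module with explicit links between them; if Llama Index pipelines can't express this graph cleanly, that points us back at the alternative below.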
Alternative
Create a custom abstraction. The main downside is that this would be more time-consuming, so we should first assess whether pipelines give us the right level of abstraction and composability.