Self Reflective RAG for a Business
⚡ Implementing a state-of-the-art advanced RAG technique: Self-Reflective RAG 💪
Project Overview
This project implements a self-reflective RAG, seamlessly integrating multiple knowledge sources (website, SQL, PDFs) while aligning closely with business requirements.
- **What is self-reflective RAG:** A self-reflective RAG is an adaptive, self-improving system that combines information retrieval and language generation to produce more accurate, context-specific responses. It adds a feedback loop in which the model evaluates and reflects on its own outputs, helping it identify and correct errors or improve its responses (the control-flow sketch below illustrates the idea).
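Conceptually, the reflection loop can be pictured as in this sketch. The helpers `generate`, `is_grounded`, and `addresses` are hypothetical stand-ins for LLM calls and graders; only the control flow is meaningful here:

```python
# Conceptual self-reflection loop. generate / is_grounded / addresses are
# hypothetical stand-ins for LLM calls; only the control flow matters.
def generate(question: str) -> str:
    return f"draft answer to: {question}"   # real system: LLM generation

def is_grounded(answer: str) -> bool:
    return True                             # real system: hallucination grader

def addresses(question: str, answer: str) -> bool:
    return True                             # real system: answer grader

def self_reflective_answer(question: str, max_retries: int = 3) -> str:
    answer = generate(question)
    for _ in range(max_retries):
        if is_grounded(answer) and addresses(question, answer):
            return answer                   # passed both reflection checks
        answer = generate(question)         # reflect and regenerate
    return answer
```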
- **Project use case:** I always wanted to create a RAG system that draws on multiple knowledge sources, specifically for a business. The project is built for businesses and integrates data sources including, but not limited to, unstructured (PDFs and documents), structured (SQL, NoSQL, and graph databases, CSV, and more), and semi-structured (websites, APIs, and other platforms), along with web-searching capabilities.
- **Tech stack:**
| Component | Technology | Description |
|---|---|---|
| RAG | LangGraph | Framework used for building the RAG model |
| Output Tracing | LangSmith | Tool used for tracing and evaluating model outputs |
| Indexing | Pinecone | Service used for indexing and managing the knowledge base |
| Web Searching | Tavily | Tool used for retrieving information from the web |
| LLM | OpenAI | Provides the language model for text generation |
| Chat Interface | Gradio | Interface for interacting with the RAG model |
| SQL Database | SQLite | Database used for querying business data and for storing RAG's memory |
How it works?
- First, the user asks a question.
- The query is analyzed by the router, which directs the RAG to the relevant knowledge source. Available routes: i. Vector store (PDFs, website), ii. SQL, iii. Web search, iv. Fallback conversational LLM
- For the vector store route, the `Retriever Node` fetches relevant documents, which are fed to the `Grader Node` for evaluation (relevant and useful or not). If the documents are not relevant, the `Query Translation Node` rewrites the question and the `Retriever Node` is called again to get better documents.
- If the documents are relevant and useful, the `Generate Node` is called, which generates a response to the question.
- The generated response is then checked for hallucination by the `Hallucination Grader`. This grader checks whether the response is grounded; if not, the `Generate Node` is called again, otherwise the next step is taken.
- Finally, the `Answer Grader Node` is responsible for checking whether the generated answer addresses the question. If not, the response generation loop runs again; otherwise, the response is provided to the user.
- For the other routes, dedicated tools, agents, and chains are invoked based on the route. Please refer to the image below, and the code sketch that follows, for a better understanding.
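Below is a minimal sketch of how such a graph can be wired with LangGraph. The node names mirror the description above, but the function bodies are placeholders and the state fields are assumptions; the actual implementation lives in `langgraph_self_reflective_rag.ipynb`:

```python
# Sketch of the self-reflective loop as a LangGraph state machine.
# Node bodies are placeholders; see the notebook for the real logic.
from typing import List, TypedDict

from langgraph.graph import END, StateGraph


class GraphState(TypedDict):
    question: str
    documents: List[str]
    generation: str


def retrieve(state: GraphState) -> dict:
    return {"documents": ["<docs from Pinecone>"]}   # Retriever Node

def grade_documents(state: GraphState) -> dict:
    return {"documents": state["documents"]}         # Grader Node

def transform_query(state: GraphState) -> dict:
    return {"question": state["question"]}           # Query Translation Node

def generate(state: GraphState) -> dict:
    return {"generation": "<answer>"}                # Generate Node

def decide_to_generate(state: GraphState) -> str:
    # Generate if relevant documents survived grading, else rewrite the query.
    return "generate" if state["documents"] else "transform_query"

def grade_generation(state: GraphState) -> str:
    # Hallucination + answer graders: "useful" ends the graph,
    # "not grounded" regenerates, "not useful" rewrites the question.
    return "useful"


workflow = StateGraph(GraphState)
workflow.add_node("retrieve", retrieve)
workflow.add_node("grade_documents", grade_documents)
workflow.add_node("transform_query", transform_query)
workflow.add_node("generate", generate)

workflow.set_entry_point("retrieve")
workflow.add_edge("retrieve", "grade_documents")
workflow.add_conditional_edges(
    "grade_documents",
    decide_to_generate,
    {"generate": "generate", "transform_query": "transform_query"},
)
workflow.add_edge("transform_query", "retrieve")
workflow.add_conditional_edges(
    "generate",
    grade_generation,
    {"useful": END, "not grounded": "generate", "not useful": "transform_query"},
)

app = workflow.compile()
```

Invoking `app.invoke({"question": "..."})` then runs retrieval, grading, generation, and the two reflection checks until the answer is judged useful.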
How to use?
Step 1: Fork and Clone the repository
- Fork the repo (or directly clone if you don't want to update with your own code).
- Set up a folder and clone using `git clone https://github.com/Taha0229/self-reflective-RAG.git .`, or `git clone <your/link/to/repo> .` if you have forked.
Step 2: Setup Virtual Environment [Not required for colab/kaggle]
- Create a virtual environment using `conda create --name self-reflective-rag python=3.10 -y`. I have used conda, but you can use your preferred tool to create a virtual environment. If this is your first time using virtual environments, install and set up conda first.
Step 3: Setup Environment Variables
- Since we are going to use multiple APIs (OpenAI, LangSmith (optional but recommended), Tavily, and Pinecone), it is recommended to use environment variables rather than hard-coding your API keys. Create a file named `.env`, then generate and paste your API keys as follows:
```
OPENAI_API_KEY = "<your-openai-api-key>"
TAVILY_API_KEY = "<your-tavily-api-key>"
PINECONE_API_KEY = "<your-pinecone-api-key>"
LANGCHAIN_API_KEY = "<your-langchain-api-key>"
LANGCHAIN_TRACING_V2 = "true"
LANGCHAIN_ENDPOINT = "https://api.smith.langchain.com"
LANGCHAIN_PROJECT = "<your-project-name>"
```
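Inside the notebook, one common way to load these variables is with `python-dotenv` (an assumption about tooling; the notebook may load them differently):

```python
# Load API keys from the .env file into the process environment.
import os

from dotenv import load_dotenv

load_dotenv()  # reads .env from the current working directory

# The OpenAI/Pinecone/Tavily/LangChain clients pick these up automatically,
# but you can also read them explicitly:
openai_key = os.environ["OPENAI_API_KEY"]
```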
Step 4: Follow the Instructions and Run `langgraph_self_reflective_rag.ipynb`
- Select a kernel for the Jupyter notebook, then run all the cells. Alternatively, you can go through each cell and customize it to your needs. I have provided markdown and comments for every cell, and docstrings are present for all classes and functions/methods.
- Cells' structure:
- Setup Environment: 4 cells
- Setup Pinecone Index: 7 cells
- Setup Chains: 18 cells
- Setup Graph: 15 cells
- Setup Chatting Interface: 6 cells
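For reference, the chat-interface cells typically reduce to wiring the compiled graph into Gradio, roughly as in this sketch (not the notebook's exact code; `app` stands for the compiled LangGraph from the graph-setup cells):

```python
# Minimal Gradio chat wiring for the compiled graph (sketch only).
import gradio as gr


def chat(message: str, history: list) -> str:
    # Run the graph on the user's question and return the final answer.
    result = app.invoke({"question": message})
    return result["generation"]


gr.ChatInterface(chat).launch()
```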
Implementation
- Self-Reflective RAG is an advanced, state-of-the-art strategy that unites (1) query analysis with (2) active / self-corrective RAG.
- The implementation is inspired by this paper.
- The architecture involves the following data sources / routes:
- URLs and PDFs for the vector store
- SQL database
- Web search using Tavily
- Fallback conversational LLM
- The Self-Reflection loop includes:
- Grading retrieved documents -> re-retrieve or change the data source if the documents are not relevant
- Hallucination checker -> regenerates the response if hallucination is found
- Answer checker -> checks whether the generated answer addresses the user query; if not, generates again (see the grader sketch below)
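All three graders can be built the same way: an LLM constrained to a structured yes/no verdict. The sketch below shows one plausible shape for the document-relevance grader using LangChain structured output; the model choice, prompt wording, and field names are assumptions, not the repository's exact code:

```python
# Sketch of a binary relevance grader (assumed shape, not the repo's code).
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI
from pydantic import BaseModel, Field


class GradeDocuments(BaseModel):
    """Binary relevance verdict for a retrieved document."""
    binary_score: str = Field(
        description="'yes' if the document is relevant to the question, else 'no'"
    )


llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)  # assumed model
grader = ChatPromptTemplate.from_messages([
    ("system", "Grade whether the document is relevant to the question. "
               "Answer 'yes' or 'no'."),
    ("human", "Document:\n{document}\n\nQuestion: {question}"),
]) | llm.with_structured_output(GradeDocuments)

verdict = grader.invoke({"document": "<retrieved text>", "question": "<query>"})
print(verdict.binary_score)  # 'yes' or 'no'
```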
- Previously explored Advanced RAG techniques and research papers:
i. Query Translation:
ii. Indexing:
iii. Other RAG Architectures:
- CRAG
GitHub Commit message format
- `Feat` – feature
- `Fix` – bug fixes
- `Docs` – changes to the documentation like README
- `Style` – style or formatting change
- `Perf` – improves code performance
- `Test` – test a feature

Example: `git commit -m "Docs: add readme"` or `git commit -m "Feat: add chatting interface"`