Self Reflective RAG for a Business
⚡ Implementing a state-of-the-art advanced RAG technique: Self-Reflective RAG 💪
Project Overview
This project implements a self-reflective RAG, seamlessly integrating multiple knowledge sources (website, SQL, PDFs) while aligning closely with business requirements.
- **What is self-reflective RAG:** A self-reflective RAG is an adaptive, self-improving system that combines information retrieval and language generation to produce more accurate, context-specific responses. It adds a feedback loop in which the model evaluates and reflects on its own outputs, helping it identify and correct errors or improve its responses (the control-flow sketch below illustrates the idea).
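Conceptually, the reflection loop can be pictured as in this sketch. The helpers `generate`, `is_grounded`, and `addresses` are hypothetical stand-ins for LLM calls and graders; only the control flow is meaningful here:

```python
# Conceptual self-reflection loop. generate / is_grounded / addresses are
# hypothetical stand-ins for LLM calls; only the control flow matters.
def generate(question: str) -> str:
    return f"draft answer to: {question}"   # real system: LLM generation

def is_grounded(answer: str) -> bool:
    return True                             # real system: hallucination grader

def addresses(question: str, answer: str) -> bool:
    return True                             # real system: answer grader

def self_reflective_answer(question: str, max_retries: int = 3) -> str:
    answer = generate(question)
    for _ in range(max_retries):
        if is_grounded(answer) and addresses(question, answer):
            return answer                   # passed both reflection checks
        answer = generate(question)         # reflect and regenerate
    return answer
```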
- **Project use case:** I always wanted to create a RAG system that draws on multiple knowledge sources, specifically for a business. The project is built for businesses and integrates data sources including, but not limited to, unstructured (PDFs and documents), structured (SQL, NoSQL, and graph databases, CSV, and more), and semi-structured (websites, APIs, and other platforms), along with web-searching capabilities.
- **Tech stack:**
| Component | Technology | Description |
|---|---|---|
| RAG | LangGraph | Framework used for building the RAG model |
| Output Tracing | LangSmith | Tool used for tracing and evaluating model outputs |
| Indexing | Pinecone | Service used for indexing and managing the knowledge base |
| Web Searching | Tavily | Tool used for retrieving information from the web |
| LLM | OpenAI | Provides the language model for text generation |
| Chat Interface | Gradio | Interface for interacting with the RAG model |
| SQL Database | SQLite | Database used for querying business data and for storing RAG's memory |
How it works?
- First, the user asks a question.
- The query is analyzed by the router, which directs the RAG to the relevant knowledge source. Available routes: i. Vector store (PDFs, website), ii. SQL, iii. Web search, iv. Fallback conversational LLM
- For the vector store route, the `Retriever Node` fetches relevant documents, which are fed to the `Grader Node` for evaluation (relevant and useful or not). If the documents are not relevant, the `Query Translation Node` rewrites the question and the `Retriever Node` is called again to get better documents.
- If the documents are relevant and useful, the `Generate Node` is called, which generates a response to the question.
- The generated response is then checked for hallucination by the `Hallucination Grader`. This grader checks whether the response is grounded; if not, the `Generate Node` is called again, otherwise the next step is taken.
- Finally, the `Answer Grader Node` is responsible for checking whether the generated answer addresses the question. If not, the response generation loop runs again; otherwise, the response is provided to the user.
- For the other routes, dedicated tools, agents, and chains are invoked based on the route. Please refer to the image below, and the code sketch that follows, for a better understanding.
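Below is a minimal sketch of how such a graph can be wired with LangGraph. The node names mirror the description above, but the function bodies are placeholders and the state fields are assumptions; the actual implementation lives in `langgraph_self_reflective_rag.ipynb`:

```python
# Sketch of the self-reflective loop as a LangGraph state machine.
# Node bodies are placeholders; see the notebook for the real logic.
from typing import List, TypedDict

from langgraph.graph import END, StateGraph


class GraphState(TypedDict):
    question: str
    documents: List[str]
    generation: str


def retrieve(state: GraphState) -> dict:
    return {"documents": ["<docs from Pinecone>"]}   # Retriever Node

def grade_documents(state: GraphState) -> dict:
    return {"documents": state["documents"]}         # Grader Node

def transform_query(state: GraphState) -> dict:
    return {"question": state["question"]}           # Query Translation Node

def generate(state: GraphState) -> dict:
    return {"generation": "<answer>"}                # Generate Node

def decide_to_generate(state: GraphState) -> str:
    # Generate if relevant documents survived grading, else rewrite the query.
    return "generate" if state["documents"] else "transform_query"

def grade_generation(state: GraphState) -> str:
    # Hallucination + answer graders: "useful" ends the graph,
    # "not grounded" regenerates, "not useful" rewrites the question.
    return "useful"


workflow = StateGraph(GraphState)
workflow.add_node("retrieve", retrieve)
workflow.add_node("grade_documents", grade_documents)
workflow.add_node("transform_query", transform_query)
workflow.add_node("generate", generate)

workflow.set_entry_point("retrieve")
workflow.add_edge("retrieve", "grade_documents")
workflow.add_conditional_edges(
    "grade_documents",
    decide_to_generate,
    {"generate": "generate", "transform_query": "transform_query"},
)
workflow.add_edge("transform_query", "retrieve")
workflow.add_conditional_edges(
    "generate",
    grade_generation,
    {"useful": END, "not grounded": "generate", "not useful": "transform_query"},
)

app = workflow.compile()
```

Invoking `app.invoke({"question": "..."})` then runs retrieval, grading, generation, and the two reflection checks until the answer is judged useful.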
How to use?
Step 1: Fork and Clone the repository
- Fork the repo (or directly clone if you don't want to update with your own code).
- Set up a folder and clone using `git clone https://github.com/Taha0229/self-reflective-RAG.git .`, or `git clone <your/link/to/repo> .` if you have forked.
Step 2: Setup Virtual Environment [Not required for colab/kaggle]
- Create a virtual environment using `conda create --name self-reflective-rag python=3.10 -y`. I have used conda, but you can use your preferred tool to create a virtual environment. If this is your first time using virtual environments, install and set up conda first.
Step 3: Setup Environment Variables
- Since we are going to use multiple APIs (OpenAI, LangSmith (optional but recommended), Tavily, and Pinecone), it is recommended to use environment variables rather than hard-coding your API keys. Create a file named `.env`, then generate and paste your API keys as follows:
```
OPENAI_API_KEY = "<your-openai-api-key>"
TAVILY_API_KEY = "<your-tavily-api-key>"
PINECONE_API_KEY = "<your-pinecone-api-key>"
LANGCHAIN_API_KEY = "<your-langchain-api-key>"
LANGCHAIN_TRACING_V2 = "true"
LANGCHAIN_ENDPOINT = "https://api.smith.langchain.com"
LANGCHAIN_PROJECT = "<your-project-name>"
```
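Inside the notebook, one common way to load these variables is with `python-dotenv` (an assumption about tooling; the notebook may load them differently):

```python
# Load API keys from the .env file into the process environment.
import os

from dotenv import load_dotenv

load_dotenv()  # reads .env from the current working directory

# The OpenAI/Pinecone/Tavily/LangChain clients pick these up automatically,
# but you can also read them explicitly:
openai_key = os.environ["OPENAI_API_KEY"]
```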
Step 4: Follow the Instructions and Run `langgraph_self_reflective_rag.ipynb`
- Select a kernel for the Jupyter notebook, then run all the cells. Alternatively, you can go through each cell and customize it to your needs. I have provided markdown and comments for every cell, and docstrings are present for all classes and functions/methods.
- Cells' structure:
- Setup Environment: 4 cells
- Setup Pinecone Index: 7 cells
- Setup Chains: 18 cells
- Setup Graph: 15 cells
- Setup Chatting Interface: 6 cells
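For reference, the chat-interface cells typically reduce to wiring the compiled graph into Gradio, roughly as in this sketch (not the notebook's exact code; `app` stands for the compiled LangGraph from the graph-setup cells):

```python
# Minimal Gradio chat wiring for the compiled graph (sketch only).
import gradio as gr


def chat(message: str, history: list) -> str:
    # Run the graph on the user's question and return the final answer.
    result = app.invoke({"question": message})
    return result["generation"]


gr.ChatInterface(chat).launch()
```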
Implementation
- Self-Reflective RAG is an advanced, state-of-the-art strategy that unites (1) query analysis with (2) active / self-corrective RAG.
- The implementation is inspired by this paper.
- The architecture involves the following data sources / routes:
- URLs and PDFs for the vector store
- SQL database
- Web search using Tavily
- Fallback conversational LLM
- The Self-Reflection loop includes:
- Grading retrieved documents -> re-retrieve or change the data source if the documents are not relevant
- Hallucination checker -> regenerates the response if hallucination is found
- Answer checker -> checks whether the generated answer addresses the user query; if not, generates again (see the grader sketch below)
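All three graders can be built the same way: an LLM constrained to a structured yes/no verdict. The sketch below shows one plausible shape for the document-relevance grader using LangChain structured output; the model choice, prompt wording, and field names are assumptions, not the repository's exact code:

```python
# Sketch of a binary relevance grader (assumed shape, not the repo's code).
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI
from pydantic import BaseModel, Field


class GradeDocuments(BaseModel):
    """Binary relevance verdict for a retrieved document."""
    binary_score: str = Field(
        description="'yes' if the document is relevant to the question, else 'no'"
    )


llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)  # assumed model
grader = ChatPromptTemplate.from_messages([
    ("system", "Grade whether the document is relevant to the question. "
               "Answer 'yes' or 'no'."),
    ("human", "Document:\n{document}\n\nQuestion: {question}"),
]) | llm.with_structured_output(GradeDocuments)

verdict = grader.invoke({"document": "<retrieved text>", "question": "<query>"})
print(verdict.binary_score)  # 'yes' or 'no'
```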
- Previously explored Advanced RAG techniques and research papers:
i. Query Translation:
ii. Indexing:
iii. Other RAG Architectures:
- CRAG
GitHub Commit message format
- `Feat` – feature
- `Fix` – bug fixes
- `Docs` – changes to the documentation like README
- `Style` – style or formatting change
- `Perf` – improves code performance
- `Test` – test a feature

Example: `git commit -m "Docs: add readme"` or `git commit -m "Feat: add chatting interface"`