GraphRAG
An in-depth study of GraphRAG
DIGIMON: Deep Analysis of Graph-Based Retrieval-Augmented Generation (RAG) Systems
GraphRAG is a popular and powerful RAG system! Inspired by systems like Microsoft's, graph-based RAG is unlocking endless possibilities in AI.
Our project focuses on modularizing and decoupling these methods to unveil the mystery behind them and share fun and valuable insights!
Representative Methods
We select the following Graph RAG methods:
Graph Types
Based on the entity and relation attributes they contain, we categorize graphs into the following types:
- Chunk Tree: A tree structure formed by document content and summary.
- Passage Graph: A relational network composed of passages, tables, and other elements within documents.
- ER Graph: An Entity-Relation Graph, which contains only entities and relations, is commonly represented as triples.
- KG: A Knowledge Graph, which enriches entities with detailed descriptions and type information.
- RKG: A Rich Knowledge Graph, which further incorporates keywords associated with relations.
The criteria for the classification of graph types are as follows:
| Graph Attributes | Chunk Tree | Passage Graph | ER | KG | RKG |
|---|---|---|---|---|---|
| Original Content | β | β | β | β | β |
| Entity Name | β | β | β | β | β |
| Entity Type | β | β | β | β | β |
| Entity Description | β | β | β | β | β |
| Relation Name | β | β | β | β | β |
| Relation keyword | β | β | β | β | β |
| Relation Description | β | β | β | β | β |
| Edge Weight | β | β | β | β | β |
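To make the graph types more concrete, here is a minimal Python sketch of the attributes named in the type list above. The class names, field names, and sample values are hypothetical and chosen only for illustration; they are not taken from the DIGIMON code base.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Entity:
    name: str                          # the entity itself (ER, KG, RKG)
    entity_type: Optional[str] = None  # type information, added by a KG
    description: Optional[str] = None  # detailed description, added by a KG


@dataclass
class Relation:
    source: str
    target: str
    name: Optional[str] = None
    keywords: list[str] = field(default_factory=list)  # keywords, added by an RKG


def as_triple(rel: Relation) -> tuple[str, Optional[str], str]:
    """ER-style view: reduce a relation to a (head, relation, tail) triple."""
    return (rel.source, rel.name, rel.target)


# Illustrative instances only.
er_edge = Relation("Marie Curie", "Warsaw", name="born_in")
kg_node = Entity("Marie Curie", entity_type="person", description="Physicist and chemist.")
rkg_edge = Relation("Marie Curie", "Warsaw", name="born_in", keywords=["birthplace"])
```

Under this sketch, an ER Graph would use only the required fields, while KG and RKG fill in the optional ones.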
Operators in the Retrieve Stage
The retrieval stage plays the key role in the entire GraphRAG process. The goal is to identify query-relevant content that supports the generation phase, enabling the LLM to provide more accurate responses.
After thoroughly reviewing all implementations, we've distilled them into a set of 16 operators. Each method then constructs its retrieval module by combining one or more of these operators.
Five Types of Operators
We classify the operators into five categories, each offering a different way to retrieve and structure relevant information from graph-based data.
Chunk Operators
Retrieve the most relevant text segments (chunks) related to the query.
| Name | Description | Example Methods |
|---|---|---|
| by_ppr | Uses Personalized PageRank to identify relevant chunks. | HippoRAG |
| by_relationship | Finds chunks that contain specified relationships. | LightRAG |
| entity_occurrence | Retrieves chunks where both entities of an edge frequently appear together. | Local Search for MS GraphRAG |
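As a rough illustration of the by_ppr chunk operator, the sketch below runs Personalized PageRank by plain power iteration and then scores each chunk by summing the scores of the entities it mentions. This is one plausible reading under our own assumptions, not HippoRAG's actual implementation; the function names, the toy graph, and the entity-to-chunk mapping are all hypothetical.

```python
def personalized_pagerank(adj, seeds, alpha=0.85, iters=50):
    """Power iteration for PPR on a graph given as {node: [neighbors]}."""
    nodes = list(adj)
    seeds = [s for s in seeds if s in adj]
    restart = {n: 0.0 for n in nodes}
    for s in seeds:
        restart[s] = 1.0 / len(seeds)
    score = dict(restart)
    for _ in range(iters):
        # With probability (1 - alpha) the walk jumps back to a seed node.
        nxt = {n: (1 - alpha) * restart[n] for n in nodes}
        for n in nodes:
            if adj[n]:
                share = alpha * score[n] / len(adj[n])
                for m in adj[n]:
                    nxt[m] += share
            else:
                # Dangling node: send its mass back to the seeds.
                for s in seeds:
                    nxt[s] += alpha * score[n] / len(seeds)
        score = nxt
    return score


def chunks_by_ppr(adj, seeds, entity_to_chunks, top_k=2):
    """Rank chunks by the summed PPR score of the entities they mention."""
    scores = personalized_pagerank(adj, seeds)
    chunk_score = {}
    for entity, chunk_ids in entity_to_chunks.items():
        for cid in chunk_ids:
            chunk_score[cid] = chunk_score.get(cid, 0.0) + scores.get(entity, 0.0)
    return sorted(chunk_score, key=chunk_score.get, reverse=True)[:top_k]


# Toy chain graph A - B - C - D; the query mentions entity "A".
adj = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
entity_to_chunks = {"A": ["c1"], "B": ["c1", "c2"], "C": ["c2"], "D": ["c3"]}
top_chunks = chunks_by_ppr(adj, {"A"}, entity_to_chunks)
```

Chunks that mention entities close to the seed rank first, which is the intuition behind using PPR for retrieval.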
Entity Operators
Retrieve entities (e.g., people, places, organizations) that are most relevant to the given query.
| Name | Description | Example Methods |
|---|---|---|
| by_relationship | Use key relationships to retrieve relevant entities | LightRAG |
| by_vdb | Finds entities via a vector database | G-retriever, MedicalRAG, RAPTOR, KGP |
| by_agent | Uses an LLM to find useful entities | TOG |
| by_ppr | Use PPR to retrieve entities | FastGraphRAG |
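The by_vdb entity operator can be pictured as a nearest-neighbor lookup over entity embeddings. The sketch below stands in for a real vector database with an in-memory dictionary and cosine similarity; the function names and the three-dimensional toy vectors are hypothetical, and a real system would use embeddings from a model and an actual vector store.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def entities_by_vdb(query_vec, entity_vecs, top_k=2):
    """Return the top_k entity names most similar to the query vector."""
    ranked = sorted(
        entity_vecs,
        key=lambda name: cosine(query_vec, entity_vecs[name]),
        reverse=True,
    )
    return ranked[:top_k]


# Toy embeddings, made up for illustration.
entity_vecs = {
    "aspirin": [0.9, 0.1, 0.0],
    "headache": [0.7, 0.6, 0.1],
    "Paris": [0.0, 0.1, 0.9],
}
top_entities = entities_by_vdb([1.0, 0.2, 0.0], entity_vecs)
```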
Relationship Operators
Extract useful relationships for the given query.
| Name | Description | Example Methods |
|---|---|---|
| by_vdb | Retrieves relationships via a vector database | LightRAG, G-retriever |
| by_agent | Uses an LLM to find useful relationships | TOG |
| by_entity | One-hop neighbors of the key entities | Local Search for MS GraphRAG |
| by_ppr | Use PPR to retrieve relationships | FastGraphRAG |
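The by_entity relationship operator can be read as collecting every relationship that touches one of the key entities, that is, their immediate neighborhood in the graph. A minimal sketch, assuming relationships are stored as (head, relation, tail) triples; the function name and the sample triples are hypothetical.

```python
def relationships_by_entity(edges, key_entities):
    """Keep the triples whose head or tail is one of the key entities."""
    keys = set(key_entities)
    return [e for e in edges if e[0] in keys or e[2] in keys]


# Toy triples, made up for illustration.
edges = [
    ("Alice", "works_at", "Acme"),
    ("Bob", "works_at", "Acme"),
    ("Bob", "lives_in", "Berlin"),
    ("Carol", "lives_in", "Berlin"),
]
alice_edges = relationships_by_entity(edges, ["Alice"])
berlin_edges = relationships_by_entity(edges, ["Berlin"])
```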
Community Operators
Identify high-level information; these operators are used only by MS GraphRAG.
| Name | Description | Example Methods |
|---|---|---|
| by_entity | Detects communities containing specified entities | Local Search for MS GraphRAG |
| by_level | Returns all communities below a specified level | Global Search for MS GraphRAG |
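Both community operators are simple filters once communities are available. The sketch below assumes each community records a hierarchy level and a set of member entities, and it treats "below a specified level" as inclusive (level <= max_level). The data layout, the names, and that inclusive reading are our assumptions, not MS GraphRAG's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Community:
    cid: str
    level: int          # position in the community hierarchy
    members: frozenset  # names of the entities in this community


def communities_by_entity(communities, entities):
    """Communities that contain at least one of the given entities."""
    wanted = set(entities)
    return [c for c in communities if wanted & c.members]


def communities_by_level(communities, max_level):
    """Communities at or below max_level (inclusive, by assumption)."""
    return [c for c in communities if c.level <= max_level]


# Toy hierarchy, made up for illustration.
communities = [
    Community("c0", 0, frozenset({"Alice", "Bob", "Carol"})),
    Community("c1", 1, frozenset({"Alice", "Bob"})),
    Community("c2", 1, frozenset({"Carol"})),
    Community("c3", 2, frozenset({"Alice"})),
]
with_carol = [c.cid for c in communities_by_entity(communities, ["Carol"])]
up_to_level_1 = [c.cid for c in communities_by_level(communities, 1)]
```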
Subgraph Operators
Extract a relevant subgraph for the given query.
| Name | Description | Example Methods |
|---|---|---|
| by_path | Retrieves a path | DALK |
| by_Steiner Tree | Constructs a minimal connecting subgraph (Steiner tree) | G-retriever |
| induced_subgraph | Extracts a subgraph induced by a set of entities and relationships. | TOG |
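Two of the subgraph operators are easy to sketch with the standard library. Below, by_path is approximated by a breadth-first shortest path between two entities, and induced_subgraph keeps only the edges whose endpoints both lie in a chosen entity set. These are simplified stand-ins with hypothetical names and a toy graph; the table describes induced_subgraph as induced by entities and relationships, whereas this sketch uses entities only, and DALK and TOG select their paths and subgraphs with their own logic.

```python
from collections import deque


def shortest_path(adj, start, goal):
    """Breadth-first search; returns a list of nodes or None if unreachable."""
    parent = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in adj.get(node, []):
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return None


def induced_subgraph(edges, nodes):
    """Edges whose two endpoints are both in the given node set."""
    keep = set(nodes)
    return [(u, v) for u, v in edges if u in keep and v in keep]


# Toy undirected graph, made up for illustration.
edges = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D"), ("D", "E")]
adj = {}
for u, v in edges:
    adj.setdefault(u, []).append(v)
    adj.setdefault(v, []).append(u)

path = shortest_path(adj, "A", "E")
sub = induced_subgraph(edges, path)
```

Chaining the two, as in the last lines, gives the subgraph induced by the entities on a retrieved path.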
You can freely combine those operators to create more and more GraphRAG methods.
Examples
Below, we present some examples illustrating how existing algorithms leverage these operators.
| Name | Operators |
|---|---|
| HippoRAG | Chunk (by_ppr) |
| LightRAG | Chunk (by_relationship) + Entity (by_relationship) + Relationship (by_vdb) |
| FastGraphRAG | Chunk (by_ppr) + Entity (by_ppr) + Relationship (by_ppr) |
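One way to picture this composition is a registry that maps (category, operator) pairs to functions, with each method defined as a list of such pairs. The sketch below copies the operator lists from the table above; the registry, the run_method helper, and the placeholder operators are hypothetical and are not DIGIMON's actual API.

```python
from typing import Callable, Dict, List, Tuple

OperatorKey = Tuple[str, str]  # (category, operator name), e.g. ("Chunk", "by_ppr")

# Operator lists copied from the table above.
METHODS: Dict[str, List[OperatorKey]] = {
    "HippoRAG": [("Chunk", "by_ppr")],
    "LightRAG": [
        ("Chunk", "by_relationship"),
        ("Entity", "by_relationship"),
        ("Relationship", "by_vdb"),
    ],
    "FastGraphRAG": [
        ("Chunk", "by_ppr"),
        ("Entity", "by_ppr"),
        ("Relationship", "by_ppr"),
    ],
}


def run_method(name: str, query: str,
               registry: Dict[OperatorKey, Callable[[str], object]]) -> Dict[str, object]:
    """Run every operator of a method and collect the results per category."""
    results = {}
    for category, op_name in METHODS[name]:
        results[category] = registry[(category, op_name)](query)
    return results


# Placeholder operators that only report what they would retrieve.
registry = {
    key: (lambda q, key=key: f"{key[0]}/{key[1]} results for {q!r}")
    for ops in METHODS.values()
    for key in ops
}

lightrag_out = run_method("LightRAG", "who founded Acme?", registry)
```

Defining a new method then amounts to adding another entry to the METHODS dictionary, provided each operator it names is present in the registry.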