
feat: Implement hierarchical retrieval architecture (#11610)

Open · hsparks-codes opened this issue 2 weeks ago · 1 comment

Summary

This PR implements the complete three-tier hierarchical retrieval architecture as specified in Issue #11610, enabling production-grade RAG capabilities for RAGFlow.

The hierarchical approach addresses the "demo-to-production" gap by implementing layered filtering that improves recall precision, optimizes system performance, and enhances flexibility for complex production environments.

Changes

Tier 1: Knowledge Base Routing

Automatically routes user queries to the most relevant knowledge bases based on intent.

  • Per-KB Retrieval Parameters: KBRetrievalParams dataclass for independent configuration per knowledge base (vector/keyword weights, similarity thresholds, top_k, rerank settings)
  • Rule-based Routing: Keyword overlap scoring between the query and the KB descriptions (a sketch follows this list)
  • LLM-based Routing: Uses chat model to intelligently select relevant KBs with fallback to rule-based
  • Configurable Methods: auto, rule_based, llm_based, or all
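
A minimal sketch of the rule-based routing idea, assuming a simple keyword-overlap score between the query and each KB description (the function and argument names below are illustrative, not the exact implementation in rag/nlp/search.py):

def rule_based_kb_routing(query: str, kb_descriptions: dict[str, str],
                          threshold: float = 0.3, top_k: int = 3) -> list[str]:
    """Score each knowledge base by keyword overlap between the query and
    its description, then keep up to top_k KBs above the threshold.
    Illustrative sketch only."""
    query_terms = set(query.lower().split())
    if not query_terms:
        return []
    scored = []
    for kb_id, description in kb_descriptions.items():
        desc_terms = set(description.lower().split())
        overlap = len(query_terms & desc_terms) / len(query_terms)
        if overlap >= threshold:
            scored.append((overlap, kb_id))
    scored.sort(reverse=True)
    return [kb_id for _, kb_id in scored[:top_k]]

A higher threshold or lower kb_top_k narrows the routing; the llm_based method would replace the overlap score with a chat-model selection and fall back to this rule when the model call fails.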

Tier 2: Document Filtering

Applies document-level metadata filtering within selected knowledge bases.

  • Intelligent Metadata Filtering: Specify key metadata fields with LLM-generated filter conditions (a sketch follows this list)
  • Metadata Similarity Matching: Fuzzy matching for text-based metadata using embeddings
  • Enhanced Metadata Generation: generate_document_metadata() for full-text metadata and summary generation
  • Batch Metadata Management: Complete CRUD operations via MetadataService
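
As an illustration of how LLM-generated filter conditions might work (the prompt wording, helper name, and chat-call shape are assumptions, not the shipped MetadataService or search code):

import json

def build_metadata_filter(chat_model, query: str, metadata_fields: list[str]) -> dict:
    """Ask a chat model to propose filter values for the configured metadata
    fields, returning e.g. {"department": "finance", "doc_type": "report"}.
    Hypothetical helper for illustration only."""
    prompt = (
        "Given the user query below, propose filter values for these metadata "
        f"fields: {metadata_fields}. Reply with a JSON object containing only "
        "those fields and omit any field you cannot infer.\n"
        f"Query: {query}"
    )
    raw = chat_model.chat(prompt)  # assumed chat interface
    try:
        filters = json.loads(raw)
    except (json.JSONDecodeError, TypeError):
        return {}
    if not isinstance(filters, dict):
        return {}
    return {k: v for k, v in filters.items() if k in metadata_fields}

The resulting dictionary restricts the document set before Tier 3 runs; metadata similarity matching covers cases where an exact filter value cannot be inferred.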

Tier 3: Chunk Refinement

Performs precise vector retrieval at the chunk level within filtered document sets.

  • Parent-Child Chunking with Summary Mapping: Match macro-themes via summary vectors, then map back to the original chunks (a sketch follows this list)
  • Customizable Prompts: Configure custom prompts for keyword extraction and question generation
  • LLM Question Generation: Generate potential questions from chunks for improved retrieval
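
A rough sketch of the summary-mapping step, using made-up data structures (SummaryNode and the cosine helper are illustrative; the PR's parent-child implementation lives in rag/nlp/search.py):

from dataclasses import dataclass, field

@dataclass
class SummaryNode:
    """One parent summary covering several original (child) chunks."""
    summary_vector: list[float]
    child_chunk_ids: list[str] = field(default_factory=list)

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(y * y for y in b) ** 0.5
    return dot / (na * nb) if na and nb else 0.0

def summary_mapped_retrieval(query_vector, summaries, chunk_store, top_k=5):
    """Rank parent summaries by similarity to the query, then expand the
    best matches back to their original child chunks."""
    ranked = sorted(summaries, key=lambda s: cosine(query_vector, s.summary_vector), reverse=True)
    chunks = []
    for node in ranked[:top_k]:
        chunks.extend(chunk_store[cid] for cid in node.child_chunk_ids)
    return chunks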

Metadata Management API

New REST API endpoints for efficient metadata management (an example request follows the table):

| Endpoint | Method | Description |
| --- | --- | --- |
| /metadata/batch/get | POST | Get metadata for multiple documents |
| /metadata/batch/update | POST | Update metadata in batch (merge or replace) |
| /metadata/batch/delete-fields | POST | Delete specific fields from multiple documents |
| /metadata/batch/set-field | POST | Set the same value for a field across documents |
| /metadata/schema/<kb_id> | GET | Get the metadata schema for a KB |
| /metadata/statistics/<kb_id> | GET | Get metadata usage statistics |
| /metadata/search | POST | Search documents by metadata filters |
| /metadata/copy | POST | Copy metadata between documents |
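
For example, a batch update call might look like the following (the request body shape, port, and response handling are assumptions; the authoritative schema is in api/apps/metadata_app.py):

import requests

# Hypothetical payload: merge new fields into the metadata of two documents.
payload = {
    "doc_ids": ["doc_1", "doc_2"],
    "metadata": {"department": "finance", "doc_type": "report"},
    "mode": "merge",  # or "replace"
}
resp = requests.post("http://localhost:9380/metadata/batch/update", json=payload)
print(resp.json())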

Files Changed

| File | Changes |
| --- | --- |
| rag/nlp/search.py | Added HierarchicalConfig, KBRetrievalParams, and HierarchicalResult dataclasses; hierarchical_retrieval() method with all tier implementations |
| agent/tools/retrieval.py | Integrated hierarchical retrieval into the agent's Retrieval tool |
| api/db/services/metadata_service.py | New service for batch metadata CRUD operations |
| api/apps/metadata_app.py | New REST API endpoints for metadata management |
| test/unit_test/nlp/test_hierarchical_retrieval.py | 31 unit tests for hierarchical retrieval |
| test/unit_test/services/test_metadata_service.py | 17 unit tests for the metadata service |

Configuration

The feature is disabled by default and fully backward-compatible. Enable via HierarchicalConfig:

from rag.nlp.search import HierarchicalConfig, KBRetrievalParams

config = HierarchicalConfig(
    enabled=True,
    # Tier 1
    enable_kb_routing=True,
    kb_routing_method="auto",  # "auto", "rule_based", "llm_based", "all"
    kb_routing_threshold=0.3,
    kb_top_k=3,
    kb_params={
        "finance_kb": KBRetrievalParams(
            kb_id="finance_kb",
            vector_similarity_weight=0.9,
            similarity_threshold=0.3
        )
    },
    # Tier 2
    enable_doc_filtering=True,
    metadata_fields=["department", "doc_type"],
    use_llm_metadata_filter=True,
    enable_metadata_similarity=True,
    # Tier 3
    enable_parent_child=True,
    use_summary_mapping=True,
    keyword_extraction_prompt="Extract domain-specific terms",
    use_llm_question_generation=True,
)
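
A hedged usage sketch follows; the call site and argument names of hierarchical_retrieval() are assumptions here rather than the real signature added in rag/nlp/search.py:

# Assumed call pattern, shown commented out because the real signature may differ:
# dealer = Dealer(...)  # the existing retrieval entry point in rag/nlp/search.py
# result = dealer.hierarchical_retrieval(
#     question="Summarize last quarter's finance reports",
#     config=config,
# )
# print(result.chunks)  # HierarchicalResult fields are likewise assumptions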

Testing

# Run all hierarchical retrieval tests
python3 -m pytest test/unit_test/nlp/test_hierarchical_retrieval.py -v -p no:unraisableexception

# Run metadata service tests
python3 -m pytest test/unit_test/services/test_metadata_service.py -v

# Run all tests together (48 tests)
python3 -m pytest test/unit_test/nlp/test_hierarchical_retrieval.py test/unit_test/services/test_metadata_service.py -v -p no:unraisableexception

Result: 48 tests passing

Expected Benefits

  1. Improved Recall Precision: Layered filtering focuses on relevant regions, reducing interference from irrelevant chunks
  2. Optimized Performance: Significantly reduces vector search candidate sets, lowering computational overhead
  3. Enhanced Intelligence: KB routing and metadata filtering enable better understanding of user intent
  4. Reduced Operational Costs: Batch metadata management minimizes maintenance overhead

Checklist

  • [x] All requirements from Issue #11610 implemented
  • [x] Tier 1: KB Routing (rule-based + LLM-based)
  • [x] Tier 2: Document Filtering (metadata + similarity)
  • [x] Tier 3: Chunk Refinement (parent-child + custom prompts)
  • [x] Per-KB retrieval parameters
  • [x] Batch metadata CRUD operations
  • [x] REST API endpoints
  • [x] Unit tests (48 passing)
  • [x] Linting passes (ruff)
  • [x] Non-breaking (disabled by default)
  • [x] Backward compatible

Related Issues

Closes #11610

hsparks-codes · Dec 09 '25 06:12

@KevinHuSh @TeslaZY @cike8899 Would you please check the PR and give me your feedback?

hsparks-codes · Dec 11 '25 11:12

Thanks for your contribution; however, it has nothing to do with the intent of the issue. Also, please note that AI is just an assistant for a feature request: you need to fully understand how the feature is meant to be designed, and take charge of the code it generates.

yingfeng · Dec 12 '25 03:12

Moreover, we oppose the use of PR submissions as a tool for blockchain mining. If you're genuinely interested in contributing to the open-source community, then you should take responsibility for the feature request itself, not spend a few minutes having AI generate code and then wait for it to be merged as your contribution to blockchain mining.

yingfeng · Dec 12 '25 04:12

Sure, no problem. I am willing to take responsibility for the feature request.

hsparks-codes · Dec 12 '25 04:12