
Refactor: Add Embedding Token Limit Configuration and Improve Error Handling

Open danielaskdd opened this pull request 2 weeks ago • 4 comments

Summary

This PR enhances the embedding functionality with configurable token limits, improved error handling, and better reliability. It consists of three related improvements that work together to provide a more robust embedding experience.

Motivation

The current implementation has several limitations:

  1. Token Overflow Risk: LLM-generated summaries could exceed embedding model token limits (e.g., bge-m3's 8192 token limit), causing silent failures
  2. Inconsistent Error Handling: Bedrock embedding errors lacked proper retry mechanisms and specific exception types
  3. Missing Token Constraints: No way to configure or validate token limits for different embedding models

Changes

Commit 1: Add max_token_size parameter to embedding function decorators

  • Added max_token_size=8192 parameter to all embedding function decorators (see the sketch after this list)
  • Moved siliconcloud implementation to deprecated folder
  • Imported wrap_embedding_func_with_attrs for consistent decorator usage
  • Updated EmbeddingFunc docstring with parameter documentation
  • Fixed langfuse import type annotation

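A minimal sketch of the decorator usage this commit standardizes on, assuming the usual lightrag.utils import path; the function body and embedding dimension are illustrative placeholders, not code from the PR:

# Hedged sketch: wrap_embedding_func_with_attrs attaches embedding_dim and
# max_token_size metadata to the wrapped embedding function.
import numpy as np
from lightrag.utils import wrap_embedding_func_with_attrs

@wrap_embedding_func_with_attrs(embedding_dim=1024, max_token_size=8192)
async def example_embed(texts: list[str]) -> np.ndarray:
    # Call the provider's embedding API here; zeros stand in for real vectors.
    return np.zeros((len(texts), 1024))
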
Commit 2: Improve Bedrock error handling with retry logic and custom exceptions

  • Added specific exception types for better error classification
  • Implemented a proper retry mechanism with exponential backoff (see the sketch after this list)
  • Enhanced error logging and validation
  • Enabled the embedding retry decorator for resilience
  • Distinguishes between retryable and non-retryable errors
  • Provides detailed error context in logs
  • Automatically retries transient failures
  • Better debugging capabilities

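A rough sketch of the retry pattern described above, written with tenacity; the exception names, attempt counts, and back-off values are illustrative, not the PR's exact code:

# Hedged sketch: retry only transient failures, with exponential backoff.
import logging
from tenacity import retry, retry_if_exception_type, stop_after_attempt, wait_exponential

logger = logging.getLogger(__name__)

class BedrockEmbeddingError(Exception):
    """Non-retryable embedding failure (illustrative name)."""

class BedrockTransientError(BedrockEmbeddingError):
    """Retryable, transient failure such as throttling (illustrative name)."""

@retry(
    stop=stop_after_attempt(4),
    wait=wait_exponential(multiplier=1, min=2, max=30),
    retry=retry_if_exception_type(BedrockTransientError),
)
async def bedrock_embed(texts: list[str]) -> list[list[float]]:
    try:
        return await _call_bedrock(texts)  # placeholder for the real Bedrock runtime call
    except TimeoutError as e:
        logger.warning("Bedrock embedding timed out, retrying: %s", e)
        raise BedrockTransientError(str(e)) from e
    except ValueError as e:
        logger.error("Bedrock embedding request invalid, not retrying: %s", e)
        raise BedrockEmbeddingError(str(e)) from e

async def _call_bedrock(texts: list[str]) -> list[list[float]]:
    raise NotImplementedError  # stand-in for the actual boto3 invoke_model call
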
Commit 3: Add configurable embedding token limit with validation

  • Added EMBEDDING_TOKEN_LIMIT environment variable support
  • Automatically sets max_token_size on embedding function initialization (see the sketch after this list)
  • Added embedding_token_limit property to LightRAG class
  • Implemented summary length validation against token limit
  • Logs warning when summary exceeds 90% of token limit

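A small sketch of that initialization wiring, i.e. copying the configured limit onto the embedding function at startup; the helper name is hypothetical and the real code lives in lightrag/api/lightrag_server.py:

# Hedged sketch: read EMBEDDING_TOKEN_LIMIT and apply it to the embedding function.
import os

def apply_embedding_token_limit(embedding_func):
    """Illustrative helper: set max_token_size from the environment, if configured."""
    raw_limit = os.getenv("EMBEDDING_TOKEN_LIMIT")
    if raw_limit:
        embedding_func.max_token_size = int(raw_limit)
    return embedding_func
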
Technical Implementation

  1. Environment Configuration (lightrag/api/config.py)
    • Read EMBEDDING_TOKEN_LIMIT as an integer (optional)
  2. Server Initialization (lightrag/api/lightrag_server.py)
    • Pass max_token_size to the embedding function if configured
  3. LightRAG Property (lightrag/lightrag.py)
    • Safe accessor for the embedding token limit on the embedding function
  4. Validation (lightrag/operate.py)
    • Check summary tokens in _summarize_descriptions
    • Log a detailed warning when the 90% threshold is exceeded (see the sketch after this list)

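A sketch of the 90% threshold check performed during summarization; apart from the 90% figure and the idea of an optional limit, the names below are illustrative stand-ins for the code in lightrag/operate.py:

# Hedged sketch of the validation step described above.
import logging

logger = logging.getLogger(__name__)

def check_summary_tokens(summary_tokens: int, token_limit: int | None) -> None:
    if token_limit is None:
        return  # graceful degradation: nothing to validate when no limit is set
    if summary_tokens > 0.9 * token_limit:
        logger.warning(
            "Summary uses %d tokens, above 90%% of the embedding token limit (%d); "
            "it may be truncated by the embedding model",
            summary_tokens,
            token_limit,
        )
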
Usage Example

# Set embedding token limit in .env file
EMBEDDING_TOKEN_LIMIT=8192

# Or as environment variable
export EMBEDDING_TOKEN_LIMIT=8192

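Once configured, the limit can be read back through the new property on the LightRAG instance; a minimal sketch (constructor arguments are omitted for brevity, and only embedding_token_limit comes from this PR):

# Hedged sketch: the property exposes the embedding function's token limit, or None.
from lightrag import LightRAG

rag = LightRAG(working_dir="./rag_storage")  # plus your usual LLM/embedding setup
if rag.embedding_token_limit is not None:
    print(f"Embedding token limit: {rag.embedding_token_limit}")
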
Technical Details

  • Backward Compatibility: All changes maintain full backward compatibility
  • Safety Margin: 90% threshold provides buffer before hard limit
  • Graceful Degradation: System continues to function when limit not configured
  • Comprehensive Logging: Detailed warnings help identify potential issues early

Testing

  • [x] Tested with various EMBEDDING_TOKEN_LIMIT values
  • [x] Verified warning logs appear at 90% threshold
  • [x] Confirmed Bedrock retry mechanism works correctly
  • [x] Validated backward compatibility without configuration
  • [x] Tested with multiple embedding model providers

Breaking Changes

None - All changes are fully backward compatible.

danielaskdd avatar Nov 14 '25 11:11 danielaskdd

@codex review

danielaskdd avatar Nov 14 '25 11:11 danielaskdd

@codex review

danielaskdd avatar Nov 14 '25 12:11 danielaskdd

@codex review

danielaskdd avatar Nov 14 '25 14:11 danielaskdd

@codex review

danielaskdd avatar Nov 14 '25 14:11 danielaskdd

@codex review

danielaskdd avatar Nov 14 '25 15:11 danielaskdd

Codex Review: Didn't find any major issues. Another round soon, please!
