LightRAG
Refactor: Add Embedding Token Limit Configuration and Improve Error Handling
Summary
This PR enhances the embedding functionality with configurable token limits, improved error handling, and better reliability. It consists of three related improvements that work together to provide a more robust embedding experience.
Motivation
The current implementation had several limitations:
- Token Overflow Risk: LLM-generated summaries could exceed embedding model token limits (e.g., bge-m3's 8192-token limit), causing silent failures
- Inconsistent Error Handling: Bedrock embedding errors lacked proper retry mechanisms and specific exception types
- Missing Token Constraints: No way to configure or validate token limits for different embedding models
Changes
Add max_token_size parameter to embedding function decorators
- Added max_token_size=8192 parameter to all embedding function decorators
- Moved siliconcloud implementation to deprecated folder
- Imported wrap_embedding_func_with_attrs for consistent decorator usage
- Updated EmbeddingFunc docstring with parameter documentation
- Fixed langfuse import type annotation
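The decorator change above can be illustrated with a minimal sketch of the attribute-attaching pattern. This is not the real `wrap_embedding_func_with_attrs` from `lightrag.utils`; it is a simplified stand-in showing how `max_token_size` ends up as metadata on the embedding function:

```python
# Minimal sketch of an attribute-attaching decorator (illustrative only;
# the real helper lives in lightrag.utils as wrap_embedding_func_with_attrs).
def wrap_embedding_func_with_attrs(**attrs):
    """Attach metadata such as embedding_dim and max_token_size to a function."""
    def decorator(func):
        for key, value in attrs.items():
            setattr(func, key, value)
        return func
    return decorator

# Hypothetical embedding function for illustration.
@wrap_embedding_func_with_attrs(embedding_dim=1024, max_token_size=8192)
async def my_embed(texts):
    ...
```

Downstream code can then read `my_embed.max_token_size` to validate inputs before calling the model.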
Improve Bedrock error handling with retry logic and custom exceptions
- Added specific exception types for better error classification
- Implemented proper retry mechanism with exponential backoff
- Enhanced error logging and validation
- Enabled embedding retry decorator for resilience
- Distinguishes between retryable and non-retryable errors
- Provides detailed error context in logs
- Automatically retries transient failures
- Better debugging capabilities
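The retry behavior described above can be sketched as follows. The exception names and backoff parameters are assumptions, not the PR's actual identifiers; the point is the retryable/non-retryable split and the exponential backoff:

```python
import asyncio
import random

# Hypothetical exception hierarchy mirroring the retryable/non-retryable split.
class BedrockEmbeddingError(Exception):
    """Base class for embedding errors."""

class RetryableError(BedrockEmbeddingError):
    """Transient failure (throttling, timeout) worth retrying."""

class NonRetryableError(BedrockEmbeddingError):
    """Permanent failure (bad input, auth) that should fail fast."""

async def with_retries(call, max_attempts=4, base_delay=1.0):
    """Retry `call` with exponential backoff, but only on RetryableError."""
    for attempt in range(1, max_attempts + 1):
        try:
            return await call()
        except RetryableError:
            if attempt == max_attempts:
                raise  # exhausted retries: surface the transient error
            # exponential backoff with a little jitter
            delay = base_delay * 2 ** (attempt - 1) + random.uniform(0, 0.1)
            await asyncio.sleep(delay)
```

A `NonRetryableError` propagates immediately, while throttling-style failures are retried with growing delays.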
Add configurable embedding token limit with validation
- Added EMBEDDING_TOKEN_LIMIT environment variable support
- Automatically sets max_token_size on embedding function initialization
- Added embedding_token_limit property to LightRAG class
- Implemented summary length validation against token limit
- Logs warning when summary exceeds 90% of token limit
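Reading the environment variable can be sketched with a small helper. The function name here is illustrative, not the PR's actual code in `lightrag/api/config.py`:

```python
import os

def get_embedding_token_limit():
    """Read EMBEDDING_TOKEN_LIMIT from the environment; None when unset.

    Hypothetical helper illustrating the optional-integer config described
    in the PR; the real parsing lives in lightrag/api/config.py.
    """
    raw = os.environ.get("EMBEDDING_TOKEN_LIMIT")
    return int(raw) if raw else None
```

Returning `None` when the variable is absent is what lets the rest of the system degrade gracefully instead of enforcing a limit that was never configured.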
Technical Implementation:
- Environment Configuration (lightrag/api/config.py): Read EMBEDDING_TOKEN_LIMIT as integer (optional)
- Server Initialization (lightrag/api/lightrag_server.py): Pass max_token_size to embedding function if configured
- LightRAG Property (lightrag/lightrag.py): Safe accessor for embedding token limit from function
- Validation (lightrag/operate.py): Check summary tokens in _summarize_descriptions; log detailed warning when threshold (90%) exceeded
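The 90% validation step can be sketched as a pure check. This is an assumption about the shape of the logic in `_summarize_descriptions`, not the actual implementation:

```python
# Hypothetical sketch of the threshold check performed before embedding a
# summary; the real validation lives in lightrag/operate.py.
def summary_exceeds_threshold(token_count, token_limit, threshold=0.9):
    """True when the summary uses more than `threshold` of the token limit.

    A None limit means no limit was configured, so the check is skipped.
    """
    if token_limit is None:
        return False
    return token_count > token_limit * threshold
```

When this returns `True`, the PR logs a detailed warning rather than failing, giving operators an early signal before the hard limit is hit.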
Usage Example
```shell
# Set embedding token limit in .env file
EMBEDDING_TOKEN_LIMIT=8192

# Or as environment variable
export EMBEDDING_TOKEN_LIMIT=8192
```
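On the Python side, the "safe accessor" property mentioned under Technical Implementation might look like the following sketch. The class body and attribute names are simplified assumptions, not the real `lightrag/lightrag.py`:

```python
class EmbeddingFuncStub:
    """Stand-in for a decorated embedding function carrying metadata."""
    max_token_size = 8192

class LightRAG:
    """Sketch of the embedding_token_limit accessor (illustrative only)."""

    def __init__(self, embedding_func):
        self.embedding_func = embedding_func

    @property
    def embedding_token_limit(self):
        # getattr with a default keeps this safe when no limit was configured
        return getattr(self.embedding_func, "max_token_size", None)
```

The `getattr` default is what preserves backward compatibility: embedding functions that never had `max_token_size` simply report no limit.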
Technical Details
- Backward Compatibility: All changes maintain full backward compatibility
- Safety Margin: 90% threshold provides buffer before hard limit
- Graceful Degradation: System continues to function when limit not configured
- Comprehensive Logging: Detailed warnings help identify potential issues early
Testing
- [x] Tested with various EMBEDDING_TOKEN_LIMIT values
- [x] Verified warning logs appear at 90% threshold
- [x] Confirmed Bedrock retry mechanism works correctly
- [x] Validated backward compatibility without configuration
- [x] Tested with multiple embedding model providers
Breaking Changes
None - All changes are fully backward compatible.
@codex review
Codex Review: Didn't find any major issues. Another round soon, please!