crewAI
Fix #2753: Handle large inputs in memory by chunking text before embedding
Problem
When memory=True is enabled and a large input is provided, the system crashes with a token limit error from the embedding model. This happens because large inputs aren't being chunked or truncated before being passed to the embedding model.
Solution
- Added constants for chunk size and overlap in utilities/constants.py
- Implemented a _chunk_text method in RAGStorage to split large texts into smaller chunks
- Modified _generate_embedding to handle chunking and add each chunk to the collection
- Added a test to verify the fix works with large inputs
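The chunking approach described above can be sketched roughly as follows. This is a minimal illustration, not the PR's exact implementation: the constant values, function name, and placement are assumptions for the example.

```python
from typing import List

# Assumed values for illustration; the PR defines these in utilities/constants.py
MEMORY_CHUNK_SIZE = 4000
MEMORY_CHUNK_OVERLAP = 200


def chunk_text(text: str) -> List[str]:
    """Split text into overlapping chunks so each stays under the embedding limit."""
    if len(text) <= MEMORY_CHUNK_SIZE:
        return [text]
    chunks = []
    # Step forward by chunk size minus overlap, so consecutive chunks share context
    step = MEMORY_CHUNK_SIZE - MEMORY_CHUNK_OVERLAP
    for start in range(0, len(text), step):
        chunks.append(text[start:start + MEMORY_CHUNK_SIZE])
    return chunks
```

Each chunk would then be embedded and added to the collection individually, instead of passing the full text to the embedding model at once.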
Testing
- Added a new test file large_input_memory_test.py to test memory with large inputs
- Verified that all existing tests still pass
Link to Devin run
https://app.devin.ai/sessions/472b1317d1074353b6a4dedc629755b8
Requested by: Joe Moura ([email protected])
🤖 Devin AI Engineer
I'll be helping with this pull request! Here's what you should know:
✅ I will automatically:
- Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
- Look at CI failures and help fix them
Note: I can only respond to comments from users who have write access to this repository.
⚙️ Control Options:
- [ ] Disable automatic comment and CI monitoring
Disclaimer: This review was made by a crew of AI Agents.
Code Review Comment for PR #2754
Overview
This pull request effectively addresses the issue of managing large text inputs within the RAG storage system by implementing a chunking mechanism. This improvement aids in handling memory limitations and prevents token limit errors during data processing. The PR introduces changes across three significant files and includes comprehensive test coverage.
Code Quality Findings and Suggestions
1. src/crewai/memory/storage/rag_storage.py
- Positive Aspects:
  - The introduction of the `_chunk_text` method allows the system to handle large text inputs effectively, enhancing overall stability.
  - The implementation employs good error handling practices, including logging, which will aid in debugging.
- Specific Improvements:
  - Method Documentation: The documentation for `_chunk_text` should detail the parameters and return types more explicitly. Example:

    ```python
    def _chunk_text(self, text: str) -> List[str]:
        """Split text into chunks to avoid token limits.

        Args:
            text: Input text to chunk.

        Returns:
            List[str]: A list of chunked text segments, adhering to defined size and overlap.
        """
    ```

  - Type Hints Enhancement: Consider enhancing type hints, particularly in `_generate_embedding`.
  - Chunk Processing Optimization: Ensure that chunk generation is efficient to minimize performance overhead. Example:

    ```python
    start_indices = range(0, len(text), MEMORY_CHUNK_SIZE - MEMORY_CHUNK_OVERLAP)
    ```
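The `start_indices` suggestion above amounts to generating chunks lazily rather than materializing them all at once. A hedged sketch of that idea, with the constant values assumed for illustration:

```python
from typing import Iterator

MEMORY_CHUNK_SIZE = 4000     # assumed value
MEMORY_CHUNK_OVERLAP = 200   # assumed value


def iter_chunks(text: str) -> Iterator[str]:
    """Yield overlapping chunks lazily, avoiding a large intermediate list."""
    step = MEMORY_CHUNK_SIZE - MEMORY_CHUNK_OVERLAP
    for start in range(0, len(text), step):
        yield text[start:start + MEMORY_CHUNK_SIZE]
```

A generator keeps peak memory proportional to one chunk rather than the whole input, which matters when very large documents are embedded one chunk at a time.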
2. src/crewai/utilities/constants.py
- Suggestions for Improvement:
  - Provide clear documentation for constants such as `MEMORY_CHUNK_SIZE` and `MEMORY_CHUNK_OVERLAP`. For instance:

    ```python
    # Maximum size for each text chunk in characters
    MEMORY_CHUNK_SIZE = 4000
    ```
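Following that suggestion, the constants module might document both values along these lines. The numbers and wording are illustrative assumptions, not necessarily what the PR ships:

```python
# Maximum size for each text chunk, in characters. Kept well below typical
# embedding-model token limits, since a token usually spans several characters.
MEMORY_CHUNK_SIZE = 4000

# Number of characters shared between consecutive chunks, so sentences that
# straddle a chunk boundary still appear intact in at least one chunk.
MEMORY_CHUNK_OVERLAP = 200
```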
3. tests/memory/large_input_memory_test.py
- Positive Aspects:
  - The newly created tests offer strong coverage for large input handling, validating the new chunking functionality.
- Suggestions for Improvement:
  - Add Edge Case Tests: Implement tests for edge cases, such as handling empty strings or inputs that match the chunk size exactly. For example:

    ```python
    def test_empty_input(short_term_memory):
        short_term_memory.save(value="", agent="test_agent")
    ```
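The exact-chunk-size edge case mentioned above can be exercised against the chunking logic in isolation. The helper below is a minimal stand-in for the PR's `_chunk_text`, with assumed constant values, used only to make the edge cases concrete:

```python
MEMORY_CHUNK_SIZE = 4000     # assumed value
MEMORY_CHUNK_OVERLAP = 200   # assumed value


def chunk_text(text):
    """Minimal stand-in for the PR's _chunk_text, used to exercise edge cases."""
    if len(text) <= MEMORY_CHUNK_SIZE:
        return [text]
    step = MEMORY_CHUNK_SIZE - MEMORY_CHUNK_OVERLAP
    return [text[i:i + MEMORY_CHUNK_SIZE] for i in range(0, len(text), step)]


def test_exact_chunk_size_input():
    # An input exactly at the limit should come back as a single, unmodified chunk.
    text = "a" * MEMORY_CHUNK_SIZE
    assert chunk_text(text) == [text]


def test_empty_input():
    # An empty string should not crash and should round-trip as-is.
    assert chunk_text("") == [""]
```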
Historical Context and Related Findings
In reviewing related pull requests, there has been a recurrent focus on improving input handling and error management. Previous discussions highlighted the need for better documentation and robust testing for newly implemented features, a trend that this PR continues.
General Recommendations
- Performance Monitoring: Integrate logging for processing times and chunk sizes to better assess performance during heavy loads.
- Memory Management: Implement limits on the number of processed chunks to prevent memory overflow and optimize resource allocation.
- Error Handling: Enhance error handling throughout the chunking and embedding process to provide detailed logs for failures.
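The monitoring and memory-management recommendations could be sketched together as follows. The logger name, cap value, and helper signature are illustrative assumptions, not part of the PR:

```python
import logging
import time

logger = logging.getLogger("crewai.memory.rag_storage")  # assumed logger name

MAX_CHUNKS = 100  # assumed cap to bound memory use on pathological inputs


def embed_chunks(chunks, embed_fn):
    """Embed each chunk, logging per-chunk timing and enforcing a chunk-count cap."""
    if len(chunks) > MAX_CHUNKS:
        logger.warning("Truncating %d chunks to %d", len(chunks), MAX_CHUNKS)
        chunks = chunks[:MAX_CHUNKS]
    embeddings = []
    for i, chunk in enumerate(chunks):
        start = time.perf_counter()
        embeddings.append(embed_fn(chunk))
        logger.debug("Chunk %d/%d (%d chars) embedded in %.3fs",
                     i + 1, len(chunks), len(chunk), time.perf_counter() - start)
    return embeddings
```

Capping the chunk count trades completeness for bounded resource use; whether truncation, sampling, or a hard error is the right policy depends on how the memory is consumed downstream.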
Conclusion
The changes presented in this PR establish a solid foundation for managing large text inputs while maintaining system performance. By addressing the outlined improvements, particularly enhancing documentation and testing coverage, the code can achieve greater clarity and robustness, making it better suited for future development needs.
Overall, this PR is a significant contribution to the project, effectively tackling the core issue at hand while promoting a maintainable and scalable codebase.
Closing due to inactivity for more than 7 days.