Low Speed of AutoMemory
Hi @frdel, thanks for your great work!
I found AutoMemory to be quite inefficient because it re-summarizes the same memories at every step of the conversation.
Suggestion
My suggestion is to implement a summarization cache. When memories are fetched from the vector DB for the first time, they should be summarized and stored in the cache. The next time the same memory is fetched, the summary can be retrieved from the cache instead of generating it again.
Benefits
- Efficiency: We won't need to re-summarize the same memories repeatedly, saving the Agent’s time.
- Cost reduction: This approach saves tokens, as fewer token-consuming summarizations are needed.
Even a simple in-memory solution would significantly improve the Agent's speed; a minimal sketch is below.
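To make the idea concrete, here is a rough sketch of such a cache, assuming hypothetical names (`SummaryCache`, a `summarize` callable) rather than Agent Zero's actual internals, which will differ:

```python
import hashlib


class SummaryCache:
    """In-memory cache of memory summaries, keyed by a hash of the raw memory text."""

    def __init__(self):
        self._cache: dict[str, str] = {}

    @staticmethod
    def _key(memory_text: str) -> str:
        # Stable key derived from the memory content itself
        return hashlib.sha256(memory_text.encode("utf-8")).hexdigest()

    def get_or_summarize(self, memory_text: str, summarize) -> str:
        key = self._key(memory_text)
        if key not in self._cache:
            # The token-consuming summarization call runs only on a cache miss
            self._cache[key] = summarize(memory_text)
        return self._cache[key]
```

The idea is that every memory fetched from the vector DB would pass through `get_or_summarize`, so an identical memory is only summarized once per session; subsequent steps reuse the stored summary.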
Let me know what you think about this suggestion, and I'd be happy to assist with a PR!
Such a cache would have to be a vector database supporting semantic search with a similarity score. This would require data preparation and comparison as well, thereby negating the benefit of a cache.