Low Speed of AutoMemory
Hi @frdel, thanks for your great work!
I found AutoMemory to be quite inefficient because it re-summarizes the same memories at every step of the conversation.
Suggestion
My suggestion is to implement a summarization cache. When memories are fetched from the vector DB for the first time, they should be summarized and stored in the cache. The next time the same memory is fetched, the summary can be retrieved from the cache instead of generating it again.
Benefits
- Efficiency: We won't need to re-summarize the same memories repeatedly, saving the Agent’s time.
- Cost reduction: This approach saves tokens, as fewer token-consuming summarizations are needed.
Even a simple in-memory solution would significantly improve the Agent's speed; a minimal sketch is below.
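To make the idea concrete, here is a rough sketch of such a cache, assuming hypothetical names (`SummaryCache`, a `summarize` callable) rather than Agent Zero's actual internals, which will differ:

```python
import hashlib


class SummaryCache:
    """In-memory cache of memory summaries, keyed by a hash of the raw memory text."""

    def __init__(self):
        self._cache: dict[str, str] = {}

    @staticmethod
    def _key(memory_text: str) -> str:
        # Stable key derived from the memory content itself
        return hashlib.sha256(memory_text.encode("utf-8")).hexdigest()

    def get_or_summarize(self, memory_text: str, summarize) -> str:
        key = self._key(memory_text)
        if key not in self._cache:
            # The token-consuming summarization call runs only on a cache miss
            self._cache[key] = summarize(memory_text)
        return self._cache[key]
```

The idea is that every memory fetched from the vector DB would pass through `get_or_summarize`, so an identical memory is only summarized once per session; subsequent steps reuse the stored summary.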
Let me know what you think about this suggestion, and I'd be happy to assist with a PR!
Such a cache would have to be a vector database supporting semantic search with a similarity score. This would require data preparation and comparison as well, thereby negating the benefit of a cache.