kernel-memory
[Feature Request]
Context / Scenario
This feature already exists in LangChain and would help save costs. The idea is to port it from Python to C#.
https://github.com/zilliztech/GPTCache
https://python.langchain.com/docs/integrations/llms/llm_caching/
https://www.mongodb.com/developer/products/atlas/advanced-rag-langchain-mongodb/
The problem
A lot of money and performance is wasted answering the same questions again and again.
Proposed solution
Save every query together with its LLM response in a database and try to fetch the answer from there first; only if it is not found, call the LLM. Provide parameters to invalidate or update the cache from time to time.
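To make the flow concrete, here is a minimal sketch of the lookup-then-call idea, written in Python only because the referenced implementations are Python. Everything in it is an assumption for illustration: `QueryCache`, `ask`, `call_llm`, and `ttl_seconds` are hypothetical names, not part of kernel-memory, LangChain, or GPTCache. It uses an in-memory dictionary instead of a database and matches queries exactly after normalization, rather than by embedding similarity.

```python
import hashlib
import time
from typing import Callable, Dict, Optional, Tuple


class QueryCache:
    """Hypothetical exact-match cache with time-based expiry (illustrative only)."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._store: Dict[str, Tuple[float, str]] = {}

    @staticmethod
    def _key(query: str) -> str:
        # Normalize case and whitespace so trivially different spellings share a key.
        normalized = " ".join(query.lower().split())
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def get(self, query: str) -> Optional[str]:
        key = self._key(query)
        entry = self._store.get(key)
        if entry is None:
            return None
        stored_at, answer = entry
        if self._clock() - stored_at >= self._ttl:
            # Entry is too old: drop it and report a miss.
            del self._store[key]
            return None
        return answer

    def put(self, query: str, answer: str) -> None:
        self._store[self._key(query)] = (self._clock(), answer)

    def invalidate(self, query: str) -> None:
        # Manual invalidation; a no-op if the query was never cached.
        self._store.pop(self._key(query), None)


def ask(query: str, cache: QueryCache, call_llm: Callable[[str], str]) -> str:
    """Return a cached answer when available, otherwise call the LLM and store the result."""
    cached = cache.get(query)
    if cached is not None:
        return cached
    answer = call_llm(query)
    cache.put(query, answer)
    return answer
```

A caller would pass its own LLM function, for example `ask("What is RAG?", cache, my_llm)`. A real implementation would likely swap the dictionary for a persistent store and could replace the exact-match key with a vector similarity lookup, so that rephrased questions also hit the cache.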
Importance
would be great to have