GPTCache
[Enhancement]: Caching Support for Agents
What would you like to be added?
While it is possible to cache each LLM call, I notice that there is no way to cache the entire thought process and subsequent output of an Agent call, e.g., LLMSingleActionAgent from LangChain. Is there any way this can be achieved?
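For illustration, a rough sketch of the behaviour I have in mind is shown below; the run_agent callable and the small wrapper are hypothetical placeholders, not existing GPTCache or LangChain APIs:
_agent_cache = {}

def cached_agent_call(request, run_agent):
    # Key the cache on the user request and store the agent's final answer,
    # so a repeated request skips the whole reasoning loop.
    if request in _agent_cache:
        return _agent_cache[request]
    result = run_agent(request)  # full thought process + final output
    _agent_cache[request] = result
    return result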
Why is this needed?
Agents will be increasingly important and heavily utilized
Anything else?
No response
You can try to use the GPTCache API, which provides get and put methods. A simple example:
from gptcache.adapter.api import put, get, init_similar_cache

# initialize a similarity-based cache, store an answer, then look it up
# with a semantically similar question
init_similar_cache()
question = "what is github"
answer = "an online platform for version control and collaboration"
put(question, answer)
get("can you explain what GitHub is")
Of course, you can also use this approach to integrate GPTCache into other LLM models or applications more quickly.
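As a hedged illustration of that pattern, the sketch below wraps an arbitrary model call with get and put; llm_fn here is just a placeholder for whatever model or chain you are calling:
from gptcache.adapter.api import put, get, init_similar_cache

init_similar_cache()

def cached_call(prompt, llm_fn):
    # Return a semantically similar cached answer if one exists,
    # otherwise call the model and store its answer for next time.
    cached_answer = get(prompt)
    if cached_answer is not None:
        return cached_answer
    answer = llm_fn(prompt)
    put(prompt, answer)
    return answer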
I don't think that will help much. @kennethleungty, are you asking about context?
Now you can try using the context processor to handle long prompts, like:
from gptcache.processor.context.summarization_context import SummarizationContextProcess
from gptcache import cache

# summarize the long prompt before it is embedded for the cache lookup
context_process = SummarizationContextProcess()
cache.init(
    pre_embedding_func=context_process.pre_process,
    ...
)
Or selective context, like:
# import path assumed to mirror the summarization module above
from gptcache.processor.context.selective_context import SelectiveContextProcess

context_processor = SelectiveContextProcess()
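To connect this back to the original question about agents: once the cache is initialized with a context processor, the LLM used inside a LangChain agent can be routed through GPTCache's LangChain adapter, so repeated runs reuse cached answers. A minimal sketch, assuming the LangChainLLMs adapter from gptcache.adapter.langchain_models and an OPENAI_API_KEY in the environment; the agent wiring itself is abbreviated:
from langchain.llms import OpenAI
from gptcache import cache
from gptcache.adapter.langchain_models import LangChainLLMs
from gptcache.processor.context.summarization_context import SummarizationContextProcess

# initialize the cache (the context processor from above is optional here)
context_process = SummarizationContextProcess()
cache.init(pre_embedding_func=context_process.pre_process)
cache.set_openai_key()  # reads OPENAI_API_KEY from the environment

# wrap the LangChain LLM so every call checks GPTCache first;
# the wrapped llm can then be passed to an LLMChain or agent in place of OpenAI()
llm = LangChainLLMs(llm=OpenAI(temperature=0))
answer = llm("what is github")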