GPTCache
Semantic cache for LLMs. Fully integrated with LangChain and llama_index.
### Current Behavior

This is an issue relating to the integration of GPTCache with LangChain.

```
import os
import time

import gptcache
from gptcache.processor.pre import get_prompt
from gptcache.manager.factory import get_data_manager
...
```
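For reference, a minimal sketch of wiring GPTCache in front of a LangChain LLM, assuming the `LangChainLLMs` adapter from `gptcache.adapter.langchain_models` and an `OPENAI_API_KEY` in the environment; the question string is illustrative:

```
from gptcache import cache
from gptcache.processor.pre import get_prompt
from gptcache.adapter.langchain_models import LangChainLLMs
from langchain.llms import OpenAI

# Key the cache on the raw prompt text; the default data manager is an
# exact-match map, so identical prompts are served from the cache.
cache.init(pre_embedding_func=get_prompt)

llm = LangChainLLMs(llm=OpenAI(temperature=0))
print(llm("What is GPTCache?"))  # first call goes to OpenAI
print(llm("What is GPTCache?"))  # repeated call is answered from the cache
```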
### What would you like to be added?

While it is possible to cache each individual LLM call, I notice that there is no way to cache the entire thought process...
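One way to approximate chain-level caching today is to key the whole run on the user's original question with GPTCache's `put`/`get` API; a hedged sketch, where `run_agent` is a hypothetical stand-in for a full multi-step chain or agent run:

```
from gptcache import cache
from gptcache.adapter.api import get, put
from gptcache.processor.pre import get_prompt

cache.init(pre_embedding_func=get_prompt)

def cached_run(question):
    answer = get(question)            # look up a finished thought process
    if answer is None:
        answer = run_agent(question)  # hypothetical: full multi-step run
        put(question, answer)         # store only the final result
    return answer
```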
### Current Behavior

I tested the official similarity example in the README:

```
from gptcache import cache
from gptcache.embedding import Onnx
from gptcache.manager import CacheBase, VectorBase, get_data_manager
from gptcache.similarity_evaluation.distance import SearchDistanceEvaluation

onnx = Onnx()
data_manager = get_data_manager(CacheBase("sqlite"), VectorBase("faiss", dimension=onnx.dimension))
cache.init(
    embedding_func=onnx.to_embeddings,
    data_manager=data_manager,
    similarity_evaluation=SearchDistanceEvaluation(),
)
...
```

but it...
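For context, the README exercises this setup by sending near-duplicate questions through GPTCache's OpenAI adapter, so the later questions should be answered from the cache via the similarity search; a sketch along those lines (the question strings are illustrative):

```
from gptcache.adapter import openai

cache.set_openai_key()

for question in ["what's github", "can you explain what GitHub is"]:
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": question}],
    )
    print(response["choices"][0]["message"]["content"])
```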
Add support for PaddleNLP embedding (issue #281)
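A hedged sketch of what such an embedding hook could look like, mirroring the `to_embeddings`/`dimension` interface that GPTCache's built-in embeddings (e.g. `Onnx`) expose; the model name, pooling choice, and PaddleNLP call details here are illustrative assumptions, not the project's implementation:

```
import paddle
from paddlenlp.transformers import AutoModel, AutoTokenizer

class PaddleNLPEmbedding:
    """Adapter exposing the to_embeddings/dimension shape GPTCache expects."""

    def __init__(self, model_name="ernie-3.0-medium-zh"):
        self.tokenizer = AutoTokenizer.from_pretrained(model_name)
        self.model = AutoModel.from_pretrained(model_name)
        self.model.eval()
        self._dim = None

    def to_embeddings(self, data, **_):
        inputs = self.tokenizer(data, return_tensors="pd")
        with paddle.no_grad():
            _, pooled_output = self.model(**inputs)  # pooled [CLS] vector
        return pooled_output[0].numpy()

    @property
    def dimension(self):
        if self._dim is None:  # probe once to discover the hidden size
            self._dim = len(self.to_embeddings("dimension probe"))
        return self._dim
```

Such a class would then plug in the same way as `Onnx`: `cache.init(embedding_func=PaddleNLPEmbedding().to_embeddings, ...)`.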
### Is your feature request related to a problem? Please describe.

Can chat caching be supported for Hugging Face LLM models?

### Describe the solution you'd like.

_No response_

### Describe an...
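Until there is a dedicated adapter, one can cache a Hugging Face model by hand with GPTCache's `put`/`get` API; a hedged sketch, where the model name, prompt format, and generation settings are illustrative:

```
from gptcache import cache
from gptcache.adapter.api import get, put
from gptcache.processor.pre import get_prompt
from transformers import pipeline

cache.init(pre_embedding_func=get_prompt)
generator = pipeline("text-generation", model="gpt2")

def cached_chat(prompt):
    answer = get(prompt)  # exact-match lookup keyed on the prompt text
    if answer is None:
        answer = generator(prompt, max_new_tokens=64)[0]["generated_text"]
        put(prompt, answer)
    return answer
```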
### Is your feature request related to a problem? Please describe.

I want to use Weaviate to find the K most similar requests, given the embedding extracted from the input request.

###...
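GPTCache has no Weaviate `VectorBase` here, so below is a hedged sketch of the requested top-K lookup written directly against the v3 `weaviate-client` API; the class and property names are illustrative:

```
import weaviate

client = weaviate.Client("http://localhost:8080")

def top_k_similar(embedding, k=5):
    # nearVector search over previously stored request embeddings
    return (
        client.query.get("CachedRequest", ["prompt", "answer"])
        .with_near_vector({"vector": embedding})
        .with_limit(k)
        .do()
    )
```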
### What would you like to be added?

RWKV Raven 7B Gradio demo: https://huggingface.co/spaces/BlinkDL/Raven-RWKV-7B
Use rwkv.cpp for CPU INT4 / INT8: https://github.com/saharNooby/rwkv.cpp
GitHub project: https://github.com/BlinkDL/ChatRWKV
Sample code using rwkv pip...
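The issue's own sample is truncated above; for orientation, a hedged sketch of generating text with the `rwkv` pip package (checkpoint path, tokenizer file, and sampling settings are placeholders):

```
from rwkv.model import RWKV
from rwkv.utils import PIPELINE, PIPELINE_ARGS

# placeholders for a downloaded Raven checkpoint and its tokenizer file
model = RWKV(model="RWKV-4-Raven-7B.pth", strategy="cpu fp32")
pipeline = PIPELINE(model, "20B_tokenizer.json")

args = PIPELINE_ARGS(temperature=1.0, top_p=0.7)
print(pipeline.generate("Tell me about ravens.", token_count=100, args=args))
```

Caching such a model could follow the same `put`/`get` pattern sketched for the Hugging Face request above.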