SimFG
@swatataidbuddy Because the core factor in determining whether two questions are similar is the choice of embedding model. If you want to be very accurate, your model must be very...
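For illustration, a minimal sketch of where the embedding model plugs into GPTCache, using the bundled ONNX model as a stand-in; a stronger embedding model can be swapped in at the same spot, and the data directory layout here is just an example:

```
from gptcache import cache
from gptcache.embedding import Onnx
from gptcache.manager import manager_factory
from gptcache.similarity_evaluation.distance import SearchDistanceEvaluation

# the embedding model passed to init() is what decides how "similar"
# two questions look to the cache, so swapping it changes hit quality
onnx = Onnx()
data_manager = manager_factory(
    "sqlite,faiss",
    vector_params={"dimension": onnx.dimension},  # vector index must match the model
)
cache.init(
    embedding_func=onnx.to_embeddings,
    data_manager=data_manager,
    similarity_evaluation=SearchDistanceEvaluation(),
)
```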
Which version of openai are you using?
@swatataidbuddy Judging from the openai return value, e.g. fields like `system_fingerprint`, this does not appear to be openai 0.28.
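If in doubt, a quick way to check which SDK is actually installed:

```
import openai

# the 0.x and 1.x openai SDKs return differently shaped response objects,
# so the exact version matters when debugging the cache integration
print(openai.__version__)
```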
@kenneth1003 You can try using the GPTCache server. Of course, it would also be nice to develop a basic npm package.
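For reference, a rough sketch of talking to a running GPTCache server over HTTP; the host, port, endpoint names, and payload shape below are assumptions based on the server examples, so check the server docs for your version. Since it is plain HTTP, an npm client would mostly be a thin fetch wrapper around the same two endpoints.

```
import requests

# assumes a GPTCache server is already running locally,
# e.g. started with: gptcache_server -s 127.0.0.1 -p 8000
BASE = "http://127.0.0.1:8000"

# store an answer for a prompt
requests.post(f"{BASE}/put", json={"prompt": "What is GPTCache?",
                                   "answer": "A semantic cache for LLM apps."})

# later, look the prompt up again (a similar prompt should also hit the cache)
resp = requests.post(f"{BASE}/get", json={"prompt": "What is GPTCache?"})
print(resp.json())
```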
Refer to: https://github.com/zilliztech/GPTCache/issues/585#issuecomment-1972720103. You should give the `cache_obj` param to the `init` func, like:
```
def init_gptcache(cache_obj: Cache, llm: str):
    print(cache.has_init)
    cache.init(
        cache_obj=cache_obj,
        pre_embedding_func=get_content_func,
        embedding_func=OpenAIEmbeddings(model="text-embedding-3-small").embed_query,
        data_manager=data_manager,
        similarity_evaluation=SearchDistanceEvaluation(),
    )
    print(cache.has_init)
```
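For completeness, a rough sketch of how an init function like this gets hooked into LangChain; import paths vary across LangChain versions (newer releases move `GPTCache` into `langchain_community.cache`), so treat this as an outline:

```
from langchain.cache import GPTCache
from langchain.globals import set_llm_cache

# LangChain calls init_gptcache(cache_obj, llm_string) lazily,
# handing in the Cache object it will later use for lookups
set_llm_cache(GPTCache(init_gptcache))
```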
@theinhumaneme This seems to be the wrong format for the custom embedding function. You can refer to: https://github.com/zilliztech/GPTCache/blob/main/gptcache/embedding/openai.py
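As a rough sketch of the expected format, modeled on the linked openai.py: gptcache wants an embedding callable that returns a fixed-length numpy vector, plus a `dimension` the vector store can use. The class name, dimension, and random vector below are placeholders for a real model call:

```
import numpy as np
from gptcache.embedding.base import BaseEmbedding


class MyEmbedding(BaseEmbedding):
    """Placeholder custom embedding in the shape gptcache expects."""

    def __init__(self, dimension: int = 384):
        self.__dimension = dimension

    def to_embeddings(self, data: str, **_) -> np.ndarray:
        # replace this with a real model call; gptcache expects a
        # fixed-length float32 vector back, not a plain Python list
        vec = np.random.rand(self.__dimension)
        return vec.astype("float32")

    @property
    def dimension(self) -> int:
        return self.__dimension


# then wire it in as: cache.init(embedding_func=MyEmbedding().to_embeddings, ...)
```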
@theinhumaneme Or you can show the embed_query func; maybe I can give you some advice.
@theinhumaneme You cannot put langchain's embedding methods into gptcache because they are incompatible, and gptcache is not taken into account when langchain changes.
If someone has better ideas, suggestions are welcome. I have opened the PR: https://github.com/zilliztech/GPTCache/pull/614. I haven't merged the PR or bumped a new version yet; I actually want to hear more...
@songsey You can try reducing the similarity_threshold, for example to 0.9 or 0.8; you will get different results.
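For reference, a minimal sketch of where that threshold lives: it is part of the `Config` object passed to `init` (the other init arguments are omitted here):

```
from gptcache import cache, Config

# lower similarity_threshold -> looser matching and more cache hits
# (with more risk of bad hits); higher -> stricter matching
cache.init(
    config=Config(similarity_threshold=0.8),
    # ... embedding_func, data_manager, similarity_evaluation as before
)
```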