
save standalone question and implement semantic cache

Open anshumantanwar opened this issue 1 year ago • 2 comments

The individual "standalone question" can now be accessed from the BaseConversationalRetrievalChain object.

Semantic cache: before calling OpenAI, the standalone question is looked up in the vector DB (cache_namespace = "cache-" + namespace) to check whether a similar question has already been asked.

The user can enable this with the use_cache flag. If set to True, the semantic cache is used.

The user can set the similarity threshold (cache_similarity_threshold); its default value is 0.85.
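The flow described above can be sketched roughly as follows. This is a minimal, in-memory illustration and not the code from this PR: the `SemanticCache` class, `answer_question`, and the `embed` and `call_llm` callables are hypothetical names, and a plain Python list stands in for the vector DB. Only `use_cache`, `cache_similarity_threshold` (default 0.85), and the `"cache-" + namespace` convention come from the description.

```python
import math
from typing import Callable, List, Optional, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class SemanticCache:
    """Hypothetical in-memory stand-in for a vector DB cache namespace."""

    def __init__(
        self,
        embed: Callable[[str], Sequence[float]],  # assumed embedding function
        namespace: str,
        cache_similarity_threshold: float = 0.85,
    ) -> None:
        self.embed = embed
        self.cache_namespace = "cache-" + namespace
        self.cache_similarity_threshold = cache_similarity_threshold
        self._entries: List[Tuple[List[float], str]] = []

    def lookup(self, standalone_question: str) -> Optional[str]:
        """Return the cached answer of the most similar question, if close enough."""
        query = self.embed(standalone_question)
        best_score, best_answer = -1.0, None
        for vector, answer in self._entries:
            score = cosine_similarity(query, vector)
            if score > best_score:
                best_score, best_answer = score, answer
        if best_score >= self.cache_similarity_threshold:
            return best_answer
        return None

    def store(self, standalone_question: str, answer: str) -> None:
        """Cache the answer under the embedding of its standalone question."""
        self._entries.append((list(self.embed(standalone_question)), answer))


def answer_question(
    standalone_question: str,
    cache: SemanticCache,
    call_llm: Callable[[str], str],  # assumed wrapper around the LLM call
    use_cache: bool = True,
) -> str:
    """Check the cache first; fall back to the LLM and cache its answer."""
    if use_cache:
        cached = cache.lookup(standalone_question)
        if cached is not None:
            return cached
    answer = call_llm(standalone_question)
    if use_cache:
        cache.store(standalone_question, answer)
    return answer
```

With `use_cache=False` the function always calls the LLM and leaves the cache untouched, which matches the opt-in behaviour described for the flag.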

anshumantanwar avatar Apr 22 '23 20:04 anshumantanwar

Some high-level questions:

  1. Is this something we put on the Retriever object (not the ConversationalRetrievalChain)?
  2. In general, is this something we should let retriever integrations handle?
  3. How much does the current implementation save, if we need to query a vector store anyway?

dev2049 avatar May 15 '23 18:05 dev2049

Please find answers to your questions below.

  1. Is this something we put on the Retriever object (not the ConversationalRetrievalChain)? Ans: Yes. The retriever will be responsible for this task, although a separate flag is passed when creating the ConversationalRetrievalChain object (if it is true, the retriever will look for cached Q&A).

  2. In general, is this something we should let retriever integrations handle? Ans: Yes, because the retriever has direct access to the vector DB (where we are caching Q&A).

  3. How much does the current implementation save, if we need to query a vector store anyway? Ans: I noticed a 70% reduction in cost and a 5x improvement in speed.

I have written a short article about this ( https://medium.com/@anshuman.tanwar.iitr/save-70-openai-costing-by-caching-results-91433cb3e85d ).
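As a rough way to reason about the third question, the saving depends mainly on how often the cache hits, since the extra vector lookup is paid on every query while the LLM call is paid only on a miss. The sketch below is a simple illustrative model, not a measurement: the function names and the cost figures in the usage note are hypothetical.

```python
def expected_cost_per_query(hit_rate: float, llm_cost: float, lookup_cost: float) -> float:
    """Expected cost when every query pays for a cache lookup and only misses pay for the LLM."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return lookup_cost + (1.0 - hit_rate) * llm_cost


def relative_saving(hit_rate: float, llm_cost: float, lookup_cost: float) -> float:
    """Fraction of the no-cache cost (one LLM call per query) that the cache saves."""
    return 1.0 - expected_cost_per_query(hit_rate, llm_cost, lookup_cost) / llm_cost
```

For example, with a hypothetical hit rate of 0.5 and a lookup that costs one tenth of an LLM call, `relative_saving(0.5, 1.0, 0.1)` comes out at about 0.4; with a hit rate of zero the cache only adds the lookup cost.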

@dev2049

anshumantanwar avatar May 18 '23 11:05 anshumantanwar

Hey @anshumantanwar ! This PR no longer lines up with the current directory structure of the library (would need to be in /libs/langchain/langchain instead of /langchain). Would you be interested in updating to the new format, or would it be better to close this and someone can work on a new PR off the current state of the library (using this as a reference implementation)?

efriis avatar Nov 07 '23 04:11 efriis

I'll close for now, and let me know if you'd like me to reopen!

efriis avatar Nov 07 '23 19:11 efriis