Vectorsearch::Pinecone - `#ask` tokens from context exceed max tokens allowed
Context
When asking a question, depending on how the data has been stored, the retrieved context can exceed the LLM's max token limit. https://github.com/andreibondarev/langchainrb/blob/ccd0fd53a9737fb61c82058e86da1c9b855ccd7f/lib/langchain/vectorsearch/pinecone.rb#L113-L123
Suggestion
In this `#ask` method, we could:
- Add the option to choose the number (`k`) of context results to fetch when querying the DB
- Add custom logic when building the context from the retrieved elements, so that the context stays under a given token count (a sketch follows this list)
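A minimal sketch of what the two suggestions could look like combined, assuming `similarity_search(query:, k:)` and `generate_prompt(question:, context:)` exist as in the linked `pinecone.rb`; the `max_context_tokens` parameter and the characters/4 token estimate are illustrative assumptions, not the library's API:

```ruby
# Hypothetical sketch, not the actual library code.
def ask(question:, k: 4, max_context_tokens: 3000)
  search_results = similarity_search(query: question, k: k)

  chunks = search_results.map { |result| result.dig("metadata").to_s }

  # Greedily keep chunks while they fit the token budget. Tokens are
  # roughly estimated as characters / 4; a real implementation would
  # use the LLM's tokenizer instead.
  budget = max_context_tokens
  kept = chunks.take_while do |chunk|
    cost = chunk.length / 4
    (budget -= cost) >= 0
  end

  prompt = generate_prompt(question: question, context: kept.join("\n---\n"))
  llm.chat(prompt: prompt)
end
```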
WDYT?
I noticed this too with OpenAI and proposed something here: https://github.com/andreibondarev/langchainrb/issues/123 to automatically set the quantity. So ideally either let it be set automatically, or let the caller choose it?
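One way "set the quantity automatically" could work (just sketching the idea; the helper name, the average-chunk-size figure, and the reserve for the answer are all hypothetical assumptions):

```ruby
# Hypothetical helper: derive k from the model's context window.
# avg_chunk_tokens and answer_reserve are rough guesses that would
# need tuning per dataset and per model.
def auto_k(model_max_tokens, avg_chunk_tokens: 256, answer_reserve: 512)
  usable = model_max_tokens - answer_reserve
  [usable / avg_chunk_tokens, 1].max
end

auto_k(4096)   # => 14
auto_k(16_384) # => 62
```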
@mael-ha
- You'd like to introduce a new `k:` param in the `ask()` method to pass it down to `similarity_search()`, right? It makes a lot of sense! I think we should do it for the other Vectorsearch classes as well.
- We've got `TokenLengthValidator` but it currently only works for OpenAI. It's called when the `OpenAI#chat()` method is invoked: https://github.com/andreibondarev/langchainrb/blob/main/lib/langchain/llm/openai.rb#L105-L107. Did you not see this error raised?
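For reference, here is a standalone way to reproduce that kind of check outside the library, assuming the third-party `tiktoken_ruby` gem; this is not `TokenLengthValidator` itself, whose exact interface may differ:

```ruby
require "tiktoken_ruby" # third-party OpenAI tokenizer bindings

# Hypothetical standalone check, mirroring what a token-length validator
# does: count tokens for a given model and raise if over the limit.
def validate_token_length!(text, model: "gpt-3.5-turbo", limit: 4096)
  tokens = Tiktoken.encoding_for_model(model).encode(text).length
  raise ArgumentError, "Prompt is #{tokens} tokens; limit is #{limit}" if tokens > limit
  tokens
end
```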
I think the future of vector storage usage is within the context of the Assistant, not via the `ask()` method.