Handling token-limit-exceeded exceptions in the Stuff Chain
Hello, I love using the stuff chain for my docs/blogs; it gives me better answers, is faster, and is cheaper. However, some questions occasionally lead to token-exceeded errors from OpenAI, so I thought that in those cases I could reduce the K value and try again.
I ended up writing this for my use case:
import tiktoken
from langchain.chains import VectorDBQAWithSourcesChain

tiktoken_encoder = tiktoken.get_encoding("gpt2")

def page_content(doc):
    return doc.page_content

def reduce_tokens_below_limit(docs, limit=3400):
    tokens = len(tiktoken_encoder.encode("".join(map(page_content, docs))))
    # Drop the least relevant document and retry until the stuffed text fits under the limit.
    return docs if tokens <= limit else reduce_tokens_below_limit(docs[:-1], limit)

class MyVectorDBQAWithSourcesChain(VectorDBQAWithSourcesChain):
    def _get_docs(self, inputs):
        question = inputs[self.question_key]
        docs = self.vectorstore.similarity_search(question, k=self.k)
        return reduce_tokens_below_limit(docs)
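For completeness, here is a rough sketch of how I wire the subclass up end to end; the FAISS store, OpenAI LLM, and the docs list are just placeholders for whatever you already have indexed:
# Hypothetical wiring example: any VectorStore works, FAISS is only an illustration.
from langchain.chains.qa_with_sources import load_qa_with_sources_chain
from langchain.embeddings import OpenAIEmbeddings
from langchain.llms import OpenAI
from langchain.vectorstores import FAISS

vectorstore = FAISS.from_documents(docs, OpenAIEmbeddings())  # `docs` is your own document list
qa_chain = load_qa_with_sources_chain(OpenAI(temperature=0), chain_type="stuff")
qa = MyVectorDBQAWithSourcesChain(combine_documents_chain=qa_chain, vectorstore=vectorstore)
print(qa({"question": "What does the blog say about pricing?"}))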
I used recursion because I liked the clarity of the method, and the K value isn't going to be higher than a single-digit number in a practical scenario.
I would love to contribute this back upstream; where do you suggest I include this, and what should I name it? I'll take care of docs and other things for this use case. :)
i love this idea! i would love to add this as a parameter to the VectorDBQAWithSourcesChain where if True it does what you described, but if False it has the current functionality. can probably default to True
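something like this sketch could work, reusing the reduce_tokens_below_limit helper from above (the flag name and defaults are only illustrative, not a committed API):
# Illustrative sketch only: the flag and its default are suggestions, not a final API.
class VectorDBQAWithSourcesChainWithFlag(VectorDBQAWithSourcesChain):
    reduce_k_below_max_tokens: bool = True   # set False to keep the current behaviour
    max_tokens_limit: int = 3400

    def _get_docs(self, inputs):
        question = inputs[self.question_key]
        docs = self.vectorstore.similarity_search(question, k=self.k)
        if self.reduce_k_below_max_tokens:
            docs = reduce_tokens_below_limit(docs, self.max_tokens_limit)
        return docs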
I tried using reduce_k_below_max_tokens but it doesn't seem to work. I still get the "InvalidRequestError: This model's maximum context length is 4096 tokens" error.
Code snippet:
from langchain.chains import VectorDBQAWithSourcesChain, load_qa_with_sources_chain
from langchain.llms import OpenAIChat

qa_chain = load_qa_with_sources_chain(OpenAIChat(temperature=0), chain_type="stuff")
qa = VectorDBQAWithSourcesChain(combine_documents_chain=qa_chain, vectorstore=vectorstore, reduce_k_below_max_tokens=True)
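One thing that may be worth checking (assuming the installed version exposes it): reduce_k_below_max_tokens only trims the retrieved documents down to the chain's max_tokens_limit, and the prompt template plus question plus completion still have to fit in the model's 4096-token window, so lowering that limit explicitly might help:
# Assumption: the chain exposes a max_tokens_limit field; pick a value that leaves
# room for the prompt template, the question, and the generated answer.
qa = VectorDBQAWithSourcesChain(
    combine_documents_chain=qa_chain,
    vectorstore=vectorstore,
    reduce_k_below_max_tokens=True,
    max_tokens_limit=2500,  # illustrative value, tune for your prompt size
)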
I am getting the same error with the "Stuff" chain. Does anyone know how to fix this? My workaround right now is to keep the indexed documents smaller.
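Roughly what I mean by keeping the indexed documents smaller, as a sketch (chunk sizes are just example values to tune):
# Sketch: split source documents into smaller chunks before indexing so that the
# k retrieved chunks stay well under the model's context window.
from langchain.embeddings import OpenAIEmbeddings
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import FAISS

splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
small_docs = splitter.split_documents(raw_docs)  # `raw_docs` is your original document list
vectorstore = FAISS.from_documents(small_docs, OpenAIEmbeddings())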
did you find any solution @dxv2k ?
Hi, @Who828! I'm Dosu, and I'm helping the LangChain team manage their backlog. I wanted to let you know that we are marking this issue as stale.
From what I understand, the issue is about handling token-exceeded errors from OpenAI in the Stuff Chain. You provided a solution using recursion to reduce the tokens below a specified limit and expressed your intention to contribute it back upstream. There have been comments from other users experiencing the same error and seeking a solution. One user even suggested adding a parameter to the VectorDBQAWithSourcesChain to handle this functionality.
Before we close this issue, we wanted to check if it is still relevant to the latest version of the LangChain repository. If it is, please let us know by commenting on the issue. Otherwise, feel free to close the issue yourself, or it will be automatically closed in 7 days.
Thank you for your understanding and contribution to the LangChain project!