openai.error.InvalidRequestError: This model's maximum context length is 4097 tokens, however you requested 11836 tokens (11580 in your prompt; 256 for the completion). Please reduce your prompt; or completion length.
This is my code for hooking up an LLM to answer questions over a database (a remote Postgres instance),

but I run into the error shown above.

Can anyone give me some advice to solve this problem?
Take a look https://github.com/hwchase17/langchain/issues/2133#issuecomment-1491522064
This is a good approach, but I don't know how to set reduce_k_below_max_tokens=True. Can you give me some examples?
Same issue here, have you solved it?
Here is an example (prompt and index are assumed to be defined earlier):
from langchain.chat_models import ChatOpenAI
from langchain.chains import VectorDBQAWithSourcesChain

chain_type_kwargs = {"prompt": prompt}
llm = ChatOpenAI(
    model_name="gpt-3.5-turbo",
    temperature=0,
    max_tokens=1000,
)
chain = VectorDBQAWithSourcesChain.from_chain_type(
    llm=llm,
    vectorstore=index.store,
    return_source_documents=True,
    chain_type_kwargs=chain_type_kwargs,
    reduce_k_below_max_tokens=True,  # drop retrieved docs until the prompt fits
)
https://github.com/Laisky/HelloWorld/blob/master/py3/ailangchain/security.ipynb
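As a brief follow-up usage sketch (the question text is only a placeholder, not from the notebook): with reduce_k_below_max_tokens=True the chain trims the retrieved documents until the combined prompt fits under its max_tokens_limit, so the call itself stays the same.

result = chain({"question": "What does the document say about authentication?"})
print(result["answer"])
print(result["sources"])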
God Bless You
If you are getting this error: openai.error.InvalidRequestError: This model's maximum context length is 4097 tokens, however you requested 4638 tokens (4382 in your prompt; 256 for the completion). Please reduce your prompt; or completion length. — the solution is below (change your host and port accordingly). The trick is to include only the tables that you want. Here is an actual code snippet:
import os
from sqlalchemy import create_engine
from langchain import OpenAI, SQLDatabase, SQLDatabaseChain

os.environ["OPENAI_API_KEY"] = 'XXXXXX'
engine = create_engine('mysql+pymysql://admin:admin@localhost:3307/wordpress1')
include_tables = ['wp_greetings']
db = SQLDatabase(engine, include_tables=include_tables)  # this will only include the list of tables you want!
llm = OpenAI(temperature=0, verbose=True, max_tokens=1000)
db_chain = SQLDatabaseChain(llm=llm, database=db, verbose=True)
db_chain.run("Describe wp_greetings table")
I tried to limit the context length by using the following code:
from langchain import OpenAI
from langchain.agents.agent_toolkits import SQLDatabaseToolkit, create_sql_agent

# db is the SQLDatabase instance for my database (defined earlier)
llm = OpenAI(temperature=0, verbose=True, max_tokens=2000)
toolkit = SQLDatabaseToolkit(db=db, llm=llm, max_tokens=2000)
agent_executor = create_sql_agent(
    llm=OpenAI(temperature=0),
    toolkit=toolkit,
    verbose=True,
    reduce_k_below_max_tokens=True,
    max_tokens=2000,
)
But that didn't seem to work. I still got the following error:
InvalidRequestError: This model's maximum context length is 4097 tokens, however you requested 4151 tokens (3895 in your prompt; 256 for the completion). Please reduce your prompt; or completion length.
Is something wrong with the way I'm using the agent, or is this a bug in LangChain? I'm using the SQL agent on a large database, so it may well be an issue with how I'm using it.
David, you can try including only the necessary tables, as I showed above. This will definitely decrease the number of tokens.
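For the agent setup specifically, here is a minimal sketch of applying the same idea (the connection string and table name are placeholders, not taken from this thread):

from sqlalchemy import create_engine
from langchain import OpenAI, SQLDatabase
from langchain.agents.agent_toolkits import SQLDatabaseToolkit, create_sql_agent

engine = create_engine('postgresql+psycopg2://user:password@localhost:5432/mydb')
# Only expose the tables the agent actually needs; this shrinks the schema
# description that gets injected into the prompt.
db = SQLDatabase(engine, include_tables=['my_table'])
llm = OpenAI(temperature=0)
toolkit = SQLDatabaseToolkit(db=db, llm=llm)
agent_executor = create_sql_agent(llm=llm, toolkit=toolkit, verbose=True)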
Thanks for the suggestion!
I only have two tables in my database, but the tables have a ton of columns in them. In this case should I try to build my own agent with the ability to summarize tables?
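Before building a custom summarizing agent, one thing worth trying (sketched below with hypothetical table names and columns, since the real schema isn't shown in this thread): SQLDatabase accepts a custom_table_info dict that replaces the auto-generated table description with a shorter hand-written one, and sample_rows_in_table_info=0 stops sample rows from being appended to the prompt.

from sqlalchemy import create_engine
from langchain import SQLDatabase

engine = create_engine('postgresql+psycopg2://user:password@localhost:5432/mydb')
# Hand-written, abbreviated schema descriptions for wide tables (hypothetical names/columns).
custom_table_info = {
    "orders": "orders(id INTEGER, customer_id INTEGER, total NUMERIC, created_at TIMESTAMP)",
    "customers": "customers(id INTEGER, name TEXT, email TEXT)",
}
db = SQLDatabase(
    engine,
    include_tables=["orders", "customers"],
    custom_table_info=custom_table_info,  # overrides the auto-generated schema text
    sample_rows_in_table_info=0,          # don't append sample rows to the prompt
)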
Hi! I'm encountering the same issue as David. Is there a method to track token usage as the program runs? I think it'd be very beneficial to monitor what is actually being counted as tokens and potentially find reductions that way.
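One option is langchain's get_openai_callback context manager, which reports the tokens consumed by the OpenAI calls made inside it. A minimal sketch, assuming the db_chain from the snippet above:

from langchain.callbacks import get_openai_callback

with get_openai_callback() as cb:
    db_chain.run("Describe wp_greetings table")
# cb exposes the token counts accumulated inside the block
print(f"Prompt tokens:     {cb.prompt_tokens}")
print(f"Completion tokens: {cb.completion_tokens}")
print(f"Total tokens:      {cb.total_tokens}")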
Hi, @wen020! I'm Dosu, and I'm helping the LangChain team manage their backlog. I wanted to let you know that we are marking this issue as stale.
From what I understand, the issue is about the maximum context length for the model being 4097 tokens, but you are requesting 11836 tokens. In the comments, there are suggestions and examples provided by users on how to solve this problem. Some users suggest setting reduce_k_below_max_tokens=True to reduce the token length, while others suggest including only necessary tables to decrease the number of tokens. Additionally, there is a question about whether there is a method to track token usage during program execution.
Before we close this issue, we wanted to check with you if it is still relevant to the latest version of the LangChain repository. If it is, please let us know by commenting on the issue. Otherwise, feel free to close the issue yourself or it will be automatically closed in 7 days.
Thank you for your understanding and contribution to the LangChain project!