
openai.error.InvalidRequestError: This model's maximum context length is 4097 tokens. However, your messages resulted in 4822 tokens. Please reduce the length of the messages.

Open DeekshithPoojary opened this issue 1 year ago • 10 comments

How to fix this?

#2637 #2531 #3557 #3256 #3889

DeekshithPoojary avatar May 06 '23 08:05 DeekshithPoojary

This can be fixed by chunking. Here is some example code:

    import openai
    import tiktoken

    def chunk_text(text: str, chunk_size: int = 4096) -> list[str]:
        # Tokenize with the same encoding the target model uses
        encoding = tiktoken.encoding_for_model("gpt-3.5-turbo")
        tokens = encoding.encode(text)
        chunks = []
        current_chunk = []

        for token in tokens:
            current_chunk.append(token)
            if len(current_chunk) >= chunk_size:
                chunks.append(encoding.decode(current_chunk))
                current_chunk = []

        if current_chunk:
            chunks.append(encoding.decode(current_chunk))

        return chunks

    def process_chunks(chunks: list[str]) -> list[str]:
        results = []

        for chunk in chunks:
            # gpt-4 is a chat model, so use the chat completions endpoint
            response = openai.ChatCompletion.create(
                model="gpt-4",
                messages=[{"role": "user", "content": chunk}],
                max_tokens=100,
            )
            results.append(response.choices[0].message.content.strip())

        return results

    text = "your_very_long_text_here"
    chunks = chunk_text(text)
    responses = process_chunks(chunks)

    combined_response = " ".join(responses)

This will split the text into smaller chunks, process each chunk separately, and then combine the responses. Alternatively, we could mix prompt engineering with chunking. Note that context might get lost during chunking, so what we can do next is check the combined result for missing context:

    def analyze_combined_response(response: str) -> str:
        # Decode each token individually so adjacent pairs can be compared as text
        encoding = tiktoken.encoding_for_model("gpt-3.5-turbo")
        tokens = [encoding.decode([t]) for t in encoding.encode(response)]
        improved_response = []

        for i, token in enumerate(tokens[:-1]):
            current_token = token
            next_token = tokens[i + 1]

            # Check if there's a possible break in the context
            if some_condition_to_detect_context_break(current_token, next_token):
                improved_token = generate_more_coherent_token(current_token, next_token)
                improved_response.append(improved_token)
            else:
                improved_response.append(current_token)

        improved_response.append(tokens[-1])
        return "".join(improved_response)

    def some_condition_to_detect_context_break(current_token: str, next_token: str) -> bool:
        # Implement the logic to detect context breaks here
        pass

    def generate_more_coherent_token(current_token: str, next_token: str) -> str:
        # Use the AI model to generate a more coherent token to bridge the gap
        prompt = f"{current_token} {next_token}"
        response = openai.ChatCompletion.create(
            model="gpt-4",
            messages=[{"role": "user", "content": prompt}],
            max_tokens=100,
        )
        return response.choices[0].message.content.strip()

Use the analyze_combined_response function on the combined_response:

    improved_response = analyze_combined_response(combined_response)

The workflow can be:

  1. Receive the large input text (e.g., a paragraph or a paper).
  2. Split the text into smaller chunks without losing context.
  3. Process each chunk with the AI model.
  4. Analyze the chunks and try to maintain context continuity between them.
  5. Combine the processed chunks into the final output.

Aavas13 avatar May 06 '23 09:05 Aavas13

openai.error.InvalidRequestError: This model's maximum context length is 8192 tokens. However, your messages resulted in 16018 tokens. Please reduce the length of the messages. Press any key to continue...

Waiting for the project to fix this.

VectorZhao avatar May 06 '23 14:05 VectorZhao

Same issue

NSBCypher avatar May 06 '23 21:05 NSBCypher

same issue here

B2Gdevs avatar May 06 '23 22:05 B2Gdevs

same issue

abah avatar May 07 '23 00:05 abah

Same issue. Unfortunately I've never been able to have a project complete because of this. Always errors out for the reason mentioned above.

carterrees-entrata avatar May 07 '23 05:05 carterrees-entrata

Same here:

      File "/usr/local/lib/python3.10/site-packages/openai/api_requestor.py", line 682, in _interpret_response_line
        raise self.handle_error_response(
    openai.error.InvalidRequestError: This model's maximum context length is 4097 tokens. However, your messages resulted in 4139 tokens. Please reduce the length of the messages

krystian-ai avatar May 07 '23 20:05 krystian-ai

Same here, a bit frustrating: not a single project has come to an end due to this. In my case it just ends the process.

KingNeza avatar May 07 '23 23:05 KingNeza

This seems to be a context limitation on OpenAI's side. Basically, if your task requires deep research and a lot of context, GPT is simply unable to remember and process all that data yet. One way to work within the limit is by summarizing the data, but you will still be limited. There is also a version of GPT-4 that allows a context length of 32k, but it is not yet publicly available. Please find more info in the links below:

https://community.openai.com/t/how-to-increase-context-length/32285/9
https://platform.openai.com/docs/models/gpt-4
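As a stopgap, a minimal sketch of that summarization workaround, assuming the old openai 0.x SDK and tiktoken (the prompt wording, token budget, and helper names are illustrative, not AutoGPT's actual implementation):

    import openai
    import tiktoken

    def summarize(text: str, model: str = "gpt-3.5-turbo") -> str:
        # Ask the model for a condensed version of the text;
        # assumes the input itself already fits the model's context (chunk first if not)
        response = openai.ChatCompletion.create(
            model=model,
            messages=[{"role": "user", "content": f"Summarize concisely:\n\n{text}"}],
            max_tokens=500,
        )
        return response.choices[0].message.content.strip()

    def fit_into_context(text: str, budget: int = 3000, model: str = "gpt-3.5-turbo") -> str:
        # Re-summarize until the text fits within the token budget
        encoding = tiktoken.encoding_for_model(model)
        while len(encoding.encode(text)) > budget:
            text = summarize(text, model)
        return text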

cyphercodes avatar May 08 '23 23:05 cyphercodes

openai.error.InvalidRequestError: This model's maximum context length is 4097 tokens. However, you requested 4155 tokens (2533 in the messages, 1622 in the completion). Please reduce the length of the messages or completion.

I can only complete a full project once this issue has been resolved.

VectorZhao avatar May 16 '23 04:05 VectorZhao

This seems like a simple fix. I'm having the same problem; I'll look into it tomorrow. It should be a simple tiktoken for-loop counter followed by an if-else break statement. None of my projects really make it past the first 20 commands before crashing from max-token errors, and that's with gpt-4 for both fast and smart.
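Roughly, such a counter could look like the sketch below. This assumes messages are the usual list of role/content dicts and that the first one is the system prompt; the token budget is illustrative, and per-message format overhead is ignored:

    import tiktoken

    def truncate_messages(messages: list[dict], budget: int = 3073) -> list[dict]:
        # Keep the system prompt plus the most recent messages that fit the budget
        # (e.g. 4097-token context minus ~1024 reserved for the completion)
        encoding = tiktoken.encoding_for_model("gpt-3.5-turbo")
        kept = [messages[0]]  # assumes messages[0] is the system prompt
        used = len(encoding.encode(messages[0]["content"]))

        for message in reversed(messages[1:]):
            tokens = len(encoding.encode(message["content"]))
            if used + tokens > budget:
                break  # stop once the next message would overflow
            kept.insert(1, message)  # insert behind the system prompt, keeping order
            used += tokens

        return kept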

JoshJarabek7 avatar May 20 '23 15:05 JoshJarabek7

me2

iamsuperzb avatar May 24 '23 02:05 iamsuperzb

Same for me. openai.error.InvalidRequestError: This model's maximum context length is 4097 tokens. However, your messages resulted in 8248 tokens. Please reduce the length of the messages.

bb071988 avatar May 28 '23 22:05 bb071988

Same here.

IMO there are two key issues here:

  1. Auto-GPT crashes and requires a restart once this error occurs (which suggests the exception isn't caught and handled appropriately?). There may also be other classes of exception thrown by the openai library that crash the main program.
  2. It might be a good idea to do some basic validation of the request before invoking it (i.e. checking that the request isn't longer than 4097 tokens); a sketch of both points follows below.
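A minimal sketch of both points, assuming the old openai 0.x SDK, gpt-3.5-turbo, and the usual role/content message format (the MAX_CONTEXT constant, function name, and recovery step are illustrative):

    import openai
    import tiktoken

    MAX_CONTEXT = 4097  # model-dependent; illustrative here

    def safe_chat_completion(messages: list[dict], max_tokens: int = 1000):
        # Point 2: rough pre-flight validation of the request size
        encoding = tiktoken.encoding_for_model("gpt-3.5-turbo")
        prompt_tokens = sum(len(encoding.encode(m["content"])) for m in messages)
        if prompt_tokens + max_tokens > MAX_CONTEXT:
            raise ValueError(f"Prompt too long: {prompt_tokens} tokens in messages")

        # Point 1: catch the API error instead of letting it crash the main loop
        try:
            return openai.ChatCompletion.create(
                model="gpt-3.5-turbo", messages=messages, max_tokens=max_tokens
            )
        except openai.error.InvalidRequestError as e:
            # Recover here (e.g. trim or summarize the history) instead of exiting
            print(f"Context overflow, recovering: {e}")
            return None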

bishi3000 avatar Jun 14 '23 21:06 bishi3000

Please post a full log when reporting an issue like this. Without a log, we can't debug.

Pwuts avatar Jun 15 '23 14:06 Pwuts

@Pwuts I ran into this on the master branch on 7/12/23. Debug logs: activity.log

  File "C:\Users\Nips\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\Nips\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "C:\Users\Nips\Desktop\NeonFork\Auto-GPT\autogpt\__main__.py", line 5, in <module>
    autogpt.cli.main()
  File "C:\Users\Nips\AppData\Local\Programs\Python\Python310\lib\site-packages\click\core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "C:\Users\Nips\AppData\Local\Programs\Python\Python310\lib\site-packages\click\core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "C:\Users\Nips\AppData\Local\Programs\Python\Python310\lib\site-packages\click\core.py", line 1635, in invoke
    rv = super().invoke(ctx)
  File "C:\Users\Nips\AppData\Local\Programs\Python\Python310\lib\site-packages\click\core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "C:\Users\Nips\AppData\Local\Programs\Python\Python310\lib\site-packages\click\core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "C:\Users\Nips\AppData\Local\Programs\Python\Python310\lib\site-packages\click\decorators.py", line 26, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "C:\Users\Nips\Desktop\NeonFork\Auto-GPT\autogpt\cli.py", line 117, in main
    run_auto_gpt(
  File "C:\Users\Nips\Desktop\NeonFork\Auto-GPT\autogpt\main.py", line 205, in run_auto_gpt
    agent.start_interaction_loop()
  File "C:\Users\Nips\Desktop\NeonFork\Auto-GPT\autogpt\agent\agent.py", line 136, in start_interaction_loop
    assistant_reply = chat_with_ai(
  File "C:\Users\Nips\Desktop\NeonFork\Auto-GPT\autogpt\llm\chat.py", line 111, in chat_with_ai
    new_summary_message, trimmed_messages = agent.history.trim_messages(
  File "C:\Users\Nips\Desktop\NeonFork\Auto-GPT\autogpt\memory\message_history.py", line 73, in trim_messages
    new_summary_message = self.update_running_summary(
  File "C:\Users\Nips\Desktop\NeonFork\Auto-GPT\autogpt\memory\message_history.py", line 203, in update_running_summary
    self.summarize_batch(batch, config, max_summary_length)
  File "C:\Users\Nips\Desktop\NeonFork\Auto-GPT\autogpt\memory\message_history.py", line 224, in summarize_batch
    self.summary = create_chat_completion(
  File "C:\Users\Nips\Desktop\NeonFork\Auto-GPT\autogpt\llm\utils\__init__.py", line 157, in create_chat_completion
    response = iopenai.create_chat_completion(
  File "C:\Users\Nips\Desktop\NeonFork\Auto-GPT\autogpt\llm\providers\openai.py", line 147, in metered_func
    return func(*args, **kwargs)
  File "C:\Users\Nips\Desktop\NeonFork\Auto-GPT\autogpt\llm\providers\openai.py", line 182, in _wrapped
    return func(*args, **kwargs)
  File "C:\Users\Nips\Desktop\NeonFork\Auto-GPT\autogpt\llm\providers\openai.py", line 231, in create_chat_completion
    completion: OpenAIObject = openai.ChatCompletion.create(
  File "C:\Users\Nips\AppData\Local\Programs\Python\Python310\lib\site-packages\openai\api_resources\chat_completion.py", line 25, in create
    return super().create(*args, **kwargs)
  File "C:\Users\Nips\AppData\Local\Programs\Python\Python310\lib\site-packages\openai\api_resources\abstract\engine_api_resource.py", line 153, in create
    response, _, api_key = requestor.request(
  File "C:\Users\Nips\AppData\Local\Programs\Python\Python310\lib\site-packages\openai\api_requestor.py", line 226, in request
    resp, got_stream = self._interpret_response(result, stream)
  File "C:\Users\Nips\AppData\Local\Programs\Python\Python310\lib\site-packages\openai\api_requestor.py", line 619, in _interpret_response
    self._interpret_response_line(
  File "C:\Users\Nips\AppData\Local\Programs\Python\Python310\lib\site-packages\openai\api_requestor.py", line 682, in _interpret_response_line
    raise self.handle_error_response(
openai.error.InvalidRequestError: This model's maximum context length is 4097 tokens. However, your messages resulted in 5121 tokens. Please reduce the length of the messages.

NeonN3mesis avatar Jul 13 '23 03:07 NeonN3mesis

Same here, hitting it after the 6th website analyzed:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/local/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/app/autogpt/__main__.py", line 5, in <module>
    autogpt.cli.main()
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1666, in invoke
    rv = super().invoke(ctx)
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/click/decorators.py", line 34, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/app/autogpt/cli.py", line 117, in main
    run_auto_gpt(
  File "/app/autogpt/main.py", line 205, in run_auto_gpt
    agent.start_interaction_loop()
  File "/app/autogpt/agent/agent.py", line 134, in start_interaction_loop
    assistant_reply = chat_with_ai(
  File "/app/autogpt/llm/chat.py", line 115, in chat_with_ai
    new_summary_message, trimmed_messages = agent.history.trim_messages(
  File "/app/autogpt/memory/message_history.py", line 77, in trim_messages
    new_summary_message = self.update_running_summary(
  File "/app/autogpt/memory/message_history.py", line 200, in update_running_summary
    self.summarize_batch(batch, config)
  File "/app/autogpt/memory/message_history.py", line 229, in summarize_batch
    self.summary = create_chat_completion(prompt, config).content
  File "/app/autogpt/llm/utils/__init__.py", line 145, in create_chat_completion
    response = iopenai.create_chat_completion(
  File "/app/autogpt/llm/providers/openai.py", line 149, in metered_func
    return func(*args, **kwargs)
  File "/app/autogpt/llm/providers/openai.py", line 186, in _wrapped
    return func(*args, **kwargs)
  File "/app/autogpt/llm/providers/openai.py", line 227, in create_chat_completion
    completion: OpenAIObject = openai.ChatCompletion.create(
  File "/usr/local/lib/python3.10/site-packages/openai/api_resources/chat_completion.py", line 25, in create
    return super().create(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/openai/api_resources/abstract/engine_api_resource.py", line 153, in create
    response, _, api_key = requestor.request(
  File "/usr/local/lib/python3.10/site-packages/openai/api_requestor.py", line 226, in request
    resp, got_stream = self._interpret_response(result, stream)
  File "/usr/local/lib/python3.10/site-packages/openai/api_requestor.py", line 619, in _interpret_response
    self._interpret_response_line(
  File "/usr/local/lib/python3.10/site-packages/openai/api_requestor.py", line 682, in _interpret_response_line
    raise self.handle_error_response(
openai.error.InvalidRequestError: This model's maximum context length is 8192 tokens. However, you requested 8192 tokens (1144 in the messages, 7048 in the completion). Please reduce the length of the messages or completion.

fosteman avatar Jul 13 '23 19:07 fosteman

@NeonN3mesis thanks for the log, and darn, I thought we had fixed that component.

@bishi3000 you're absolutely right that we need better error handling, and also state persistence so you don't lose your progress when it happens. We had someone working on general error handling and Sentry integration, but they had to leave the project before finishing it. If you're interested in picking that up, let us know!

It might be a good idea to do some basic validation of the request before invoking it (i.e. checking that the request isn't longer than 4097 tokens)

I see little use in doing that kind of validation when there's no mechanism in place to fix or circumvent the problem. We always try to prevent generating oversized prompts.

Pwuts avatar Jul 13 '23 22:07 Pwuts

Interested in helping out here. @bishi3000, if you're up, we can meld our brains on Discord!

fosteman avatar Jul 14 '23 02:07 fosteman

Yes, happy to be involved.

bishi3000 avatar Jul 14 '23 07:07 bishi3000

@Pwuts @fosteman happy to be involved. How do we do this on Discord?

bishi3000 avatar Jul 17 '23 13:07 bishi3000

This error also happens if you are using gpt-3.5-turbo with an agent. Sometimes the Action Input is None, and for some reason the agent accesses the retriever many times, so you end up with duplicated Observations that produce a long text (messages), and then:

    INFO:openai:error_code=context_length_exceeded error_message="This model's maximum context length is 4097 tokens. However, you requested 5705 tokens (3705 in the messages, 2000 in the completion). Please reduce the length of the messages or completion." error_param=messages error_type=invalid_request_error message='OpenAI API error received' stream_error=False

Any idea how to solve this error? How can I remove the duplicated Observations? How can I stop the agent from calling the retriever so many times? And how can I make sure the agent creates an Action Input that is not None?
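For the duplicated Observations specifically, here is a minimal sketch, assuming the observations are collected as a list of strings before being joined into the prompt (the function name is hypothetical):

    def dedupe_observations(observations: list[str]) -> list[str]:
        # Drop exact-duplicate observations while preserving their order
        seen = set()
        unique = []
        for obs in observations:
            if obs not in seen:
                seen.add(obs)
                unique.append(obs)
        return unique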

ericbellet avatar Aug 04 '23 08:08 ericbellet

This issue has automatically been marked as stale because it has not had any activity in the last 50 days. You can unstale it by commenting or removing the label. Otherwise, this issue will be closed in 10 days.

github-actions[bot] avatar Nov 10 '23 01:11 github-actions[bot]

Hi, any updates on this issue?

SuryanshuTiwari avatar Nov 16 '23 07:11 SuryanshuTiwari

Afaik this has been fixed a while ago

Pwuts avatar Feb 13 '24 10:02 Pwuts