AutoGPT
FIX for API openai.error.InvalidRequestError / maximum context length is [4k/8k] tokens, yours [arbitrary-giant-number] --> RENAME YOUR FILES internally BEFORE passing the filenames to the AI for processing!
tl;dr:
No matter if you are juggling something downloaded from the web, or files you already have lying around, or files generated by generate_image:
RENAMING THE FILES internally before passing them to the AI will fix the maximum context length error.
Make copies or rename everything to t_0001.txt, t_0002.txt, i_0001.png, etc. - whatever! JUST make SURE the AI does not get to "see" (ingest for processing) filenames any more complex than roughly those!
The CONTENT of the file can be as long as expected re: tokens! 3 files, named 0001.txt, 0002.txt, 0003.txt, each with 500 characters inside - no problem!
But 3-10 (depends) FILENAMES of the kind my_very_diverse_filename_is_long_so_long.txt with just 50 characters inside? -> Nuked by the API with the maximum context length error!
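Here is a minimal sketch of what that can look like in Python (the helper name short_alias and the alias_workspace folder are made up for illustration; this is not AutoGPT's own code):

```python
# Minimal sketch: copy each file to a short, numbered alias and hand ONLY
# that alias to the AI. short_alias() and ALIAS_DIR are hypothetical names.
import itertools
import shutil
from pathlib import Path

ALIAS_DIR = Path("alias_workspace")   # assumed staging folder for the short copies
ALIAS_DIR.mkdir(exist_ok=True)
_counter = itertools.count(1)

def short_alias(original: str) -> str:
    """Copy `original` to t_0001.<ext>, t_0002.<ext>, ... and return the short name."""
    src = Path(original)
    alias = ALIAS_DIR / f"t_{next(_counter):04d}{src.suffix}"
    shutil.copy2(src, alias)          # the long-named file stays untouched on disk
    return alias.name                 # this short name is all the AI gets to "see"

# e.g. report short_alias("my_very_diverse_filename_is_long_so_long.txt")
# in the command result instead of the original name.
```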
Verbose:
It makes sense, if you think about it; a filename, to the AI, is like what a giant IBAN number for a bank transfer is to you: every character and every position matters.
How I found out:
I initially started my <doesn't matter here> with the stable-diffusion command line, which just so happens to save files as 00000.png - ascending numbers.
That filename would then get passed on to Auto-GPT.
Using GPT-4 via the API, I recently let this run for 42 iterations, and it went flawlessly (I just stopped it at that point because it had cost $10 and had long since proved its point about coherence).
Now I did the exact same thing with another <doesn't matter> that returns an image. However, it didn't do so automatically; I had to code that (well, I had to make ChatGPT-4 code it). So I thought I was gonna be smart and had it implement a sanitize_prompt function and then save images as a_very_long_prompt_that_is_long_inspired_by_human_likes.png.
Because that's kinda nicer than 00000.png, 00001.png, etc. - and, well, I would hit the maximum context length error with some insane token count (mine being 37265) after fewer than 5 "loops" / iterations (2-4, depending)! JUST WHY?
Well, because of the filename! Everything else - from what Auto-GPT has to do and gets to "see" - is the same as with the stable-diffusion stuff.
So now I am saving it internally as that_long_sanitized_prompt_filename.png but making a copy as s_001.png - always overwriting the same file - and only returning that to Auto-GPT:
SYSTEM: Command run_shape returned: SHAPE image saved to file s_001.png
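Roughly, the pattern looks like this (a sketch only; save_for_agent is an illustrative name, not actual Auto-GPT code, and the SHAPE wording just mirrors my command's output):

```python
# Sketch of the "long name on disk, short name for the agent" pattern.
# save_for_agent() is an illustrative name, not actual Auto-GPT code.
import shutil
from pathlib import Path

SHORT_NAME = Path("s_001.png")        # one fixed short name, always overwritten

def save_for_agent(image_bytes: bytes, sanitized_prompt: str) -> str:
    long_path = Path(f"{sanitized_prompt}.png")   # keep the descriptive name on disk
    long_path.write_bytes(image_bytes)
    shutil.copyfile(long_path, SHORT_NAME)        # overwrite the same short file every time
    # only the short name ever enters the agent's context
    return f"SHAPE image saved to file {SHORT_NAME}"
```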
I just had 10 flawless runs without running out of context length (I only stopped there because of the $$$ token-gobbling).
Further on: GPT-4 via ChatGPT-4 coded everything I have implemented so far, including everything above, using argparse, subprocess, glob.glob, and many more sophisticated coding AI-hacks - so:
How do I implement this? ->> Ask GPT-4! I sure as heck wouldn't know how to rename files after they get downloaded / webscraped, for example. But I am sure that if you show GPT-4 the code and explain your issue, you will be helped.
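For the downloaded / webscraped case specifically, a rough glob.glob-based sketch could look like the following (the directory paths are assumptions for illustration, not Auto-GPT defaults):

```python
# Rough sketch: copy every downloaded file to a short numbered name in a
# staging folder and let the agent see only those names.
import glob
import os
import shutil

def stage_short_names(src_dir: str = "auto_gpt_workspace/downloads",
                      staging_dir: str = "auto_gpt_workspace/short") -> list[str]:
    os.makedirs(staging_dir, exist_ok=True)
    short_names = []
    files = sorted(p for p in glob.glob(os.path.join(src_dir, "*")) if os.path.isfile(p))
    for i, path in enumerate(files, start=1):
        ext = os.path.splitext(path)[1]
        short = f"d_{i:04d}{ext}"
        shutil.copyfile(path, os.path.join(staging_dir, short))
        short_names.append(short)
    return short_names   # hand these to the agent, never the original filenames
```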
So don't ask me ~~ ask your trusty GPT!
Good luck! 👍
Are you using OpenAI's ada model to embed the tokens?
EMBEDDING_MODEL=text-embedding-ada-002
EMBEDDING_TOKENIZER=cl100k_base
EMBEDDING_TOKEN_LIMIT=8191
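If you want to see how many tokens a given filename or prompt actually costs with that tokenizer, a quick check (assuming the tiktoken package is installed) is:

```python
# Quick token-count check with the cl100k_base tokenizer (requires tiktoken).
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
for name in ("0001.txt", "my_very_diverse_filename_is_long_so_long.txt"):
    print(name, "->", len(enc.encode(name)), "tokens")
```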
text-embedding-ada-002-v2 -- why, did that change with v0.3.0? So the problem everybody else is having (re: maximum context length) with that somewhat glitchy version (my observation: GPT-3.5 went completely bonkers in v0.3.0) is now something completely different? 🤔
If the "filename problem" I mentioned is fixed in v0.3.0, sorry about that - I am still using v0.2.2 because, well, AI were "more weird than weird" in the new version. But this might be because of the stuff I implemented, so nvm!
I am still seeing this on current master (d3fc8c4); in my case it was right after writing out a short Python script it had created on the 3rd command.
SYSTEM: Command write_to_file returned: File written to successfully.
Traceback (most recent call last):
File "<frozen runpy>", line 198, in _run_module_as_main
...
Has anyone been able to confirm filename length and complexity as the culprit? I do not seem to get very far without stumbling onto this with current stable or master, either inside the Docker image or running in a virtualenv on Debian 12.
Does the workspace path come into play?
/app/workspace
vs
/usr/local/src/Auto-GPT-master/autogpt/auto_gpt_workspace/
For kicks, I am going to set some filename-length limits as a goal for any created Python scripts and report back.
This issue has automatically been marked as stale because it has not had any activity in the last 50 days. You can unstale it by commenting or removing the label. Otherwise, this issue will be closed in 10 days.
This issue was closed automatically because it has been stale for 10 days with no activity.