AutoGPT
Fix the maximum context length issue by chunking
Background
Multiple issues have been opened about the same problem, e.g. #2801, #2871, #2906 and more: several commands call memory.add(), which in turn calls create_embedding_with_ada. When the input text exceeds the model's 8191-token limit, we get an InvalidRequestError saying "This model's maximum context length is 8191 tokens...".
Resolves #2801, resolves #2871, resolves #2906, resolves #3244
Changes
The issue is fixed by chunking the input text, running the embedding on each chunk individually, and then combining the results by weighted averaging. This approach is suggested by OpenAI and modeled after the OpenAI Cookbook. This PR should fix a number of open issues, including the ones mentioned above and more.
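For reference, here is a minimal sketch of the chunk-and-average strategy, modeled on the OpenAI Cookbook recipe for embedding texts longer than the context window. The names and structure below are illustrative only and do not necessarily match this PR's code (see autogpt/llm/llm_utils.py for the actual implementation):

```python
# Illustrative sketch of chunk-and-average embedding (pre-1.0 openai SDK).
import numpy as np
import openai
import tiktoken

EMBEDDING_MODEL = "text-embedding-ada-002"
EMBEDDING_CTX_LENGTH = 8191

def chunked_tokens(text: str, chunk_length: int = EMBEDDING_CTX_LENGTH):
    """Split the token sequence into chunks that fit the model's context."""
    encoding = tiktoken.encoding_for_model(EMBEDDING_MODEL)
    tokens = encoding.encode(text)
    for i in range(0, len(tokens), chunk_length):
        yield tokens[i : i + chunk_length]

def len_safe_get_embedding(text: str) -> list[float]:
    """Embed each chunk separately, then combine by length-weighted average."""
    chunk_embeddings, chunk_lens = [], []
    for chunk in chunked_tokens(text):
        response = openai.Embedding.create(input=chunk, model=EMBEDDING_MODEL)
        chunk_embeddings.append(response["data"][0]["embedding"])
        chunk_lens.append(len(chunk))

    # Weight each chunk's embedding by its token count, then re-normalize
    # so the combined vector has unit length again.
    average = np.average(chunk_embeddings, axis=0, weights=chunk_lens)
    average = average / np.linalg.norm(average)
    return average.tolist()
```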
PR Quality Checklist
- [x] My pull request is atomic and focuses on a single change.
- [x] I have thoroughly tested my changes with multiple different prompts.
- [x] I have considered potential risks and mitigations for my changes.
- [x] I have documented my changes clearly and comprehensively.
- [x] I have not snuck in any "extra" small tweaks or changes
Codecov Report
Patch coverage: 86.48% and project coverage change: +0.24 :tada:
Comparison is base (0ef6f06) 60.31% compared to head (572cac9) 60.55%.
Additional details and impacted files
@@ Coverage Diff @@
## master #3222 +/- ##
==========================================
+ Coverage 60.31% 60.55% +0.24%
==========================================
Files 69 69
Lines 3152 3184 +32
Branches 525 528 +3
==========================================
+ Hits 1901 1928 +27
- Misses 1118 1122 +4
- Partials 133 134 +1
| Impacted Files | Coverage Δ | |
|---|---|---|
| autogpt/llm/__init__.py | 100.00% <ø> (ø) | |
| autogpt/llm/modelsinfo.py | 100.00% <ø> (ø) | |
| autogpt/config/config.py | 76.25% <66.66%> (-0.58%) | :arrow_down: |
| autogpt/llm/llm_utils.py | 66.66% <92.85%> (+5.34%) | :arrow_up: |
@Pwuts I think this change can fix and close multiple open issues. Could you please review, approve and merge?
Please link issues if this PR resolves them
Also, this is missing test coverage. Can you fix that (using pytest, not unittest)?
This pull request has conflicts with the base branch, please resolve those so we can evaluate the pull request.
Endless crashes over the last 4 days, though they have been happening less often over the past 2-3 weeks. It crashed 4 times in a row and then constantly for 3 hours on every restart. Here is some code to cap the max input length for GPT-3.5-turbo: since the max is about 8191 tokens, staying under 24000 characters seems to be safe most of the time. Here is the code:
https://github.com/Significant-Gravitas/Auto-GPT/discussions/3239#discussion-5130661
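(The actual code is in the linked discussion; roughly, the idea is a simple character-based cap, something like the hypothetical sketch below. The name and the 24000-character threshold here are assumptions, not the linked code.)

```python
# Hypothetical sketch only -- the real code lives in the linked discussion.
# ~24000 characters is a rough heuristic for staying under 8191 tokens
# (ada-style tokenizers average roughly 3-4 characters per token).
MAX_CHARS = 24000

def cap_text(text: str, max_chars: int = MAX_CHARS) -> str:
    """Truncate text so the embedding request stays under the token limit."""
    return text if len(text) <= max_chars else text[:max_chars]
```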
I've written code that uses NumPy to count characters and spaCy to count tokens: it uses the character counts to split the text into blocks that come in under the token limit for an embedding, submits each embedding, adds to the total cost, etc. When all embeddings are done, it averages them and returns the combined embedding.
I DO NOT know if this is a great approach, because I am not sure whether I should be averaging, or concatenating into a much beefier vector.
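To make the two options concrete, here is a toy comparison of averaging versus concatenating per-chunk embeddings (hypothetical helper names, not AutoGPT code):

```python
# Toy illustration of the two combining strategies discussed above,
# assuming we already have one embedding vector per chunk.
import numpy as np

def combine_by_average(chunk_embeddings: list[list[float]],
                       chunk_lens: list[int]) -> np.ndarray:
    """Length-weighted average: keeps the original dimensionality
    (1536 for text-embedding-ada-002)."""
    avg = np.average(chunk_embeddings, axis=0, weights=chunk_lens)
    return avg / np.linalg.norm(avg)

def combine_by_concatenation(chunk_embeddings: list[list[float]]) -> np.ndarray:
    """Concatenation: a 'beefier' vector, but its size grows with the number
    of chunks, which fixed-dimension vector stores cannot index directly."""
    return np.concatenate(chunk_embeddings)
```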
Conflicts have been resolved! 🎉 A maintainer will review the pull request shortly.
Linked the issues that this PR is going to fix and added a unit test for the new chunk token func
This pull request has conflicts with the base branch, please resolve those so we can evaluate the pull request.
Conflicts have been resolved! 🎉 A maintainer will review the pull request shortly.
Merging vectors it is!
@Pwuts thanks very much for the review. Though a weighted average is not the most sophisticated way to combine the chunks, it is still effective in many cases and better than simply truncating. The tradeoff of not doing any averaging now is that Auto-GPT crashes and loses all its work, which is something we don't want either. I suggest a two-step approach: first, use the weighted-average technique to avoid crashes and keep making progress; then, implement a more advanced partition-averaging technique, which requires clustering or learning methods, e.g. the "Sparse Dictionary learning" used by the quoted paper to group similar sentences together. To stay organized, we can merge this PR, close the related issues, and open a new one to track progress. This approach will balance our concerns and allow us to make progress effectively.
Can you show that this performs? Based on my limited knowledge I'm not convinced that taking a weighted average will be effective for large amounts of content.
I pushed code last night into the associated branch that concatenates rather than averages.
This pull request has conflicts with the base branch, please resolve those so we can evaluate the pull request.
@sidewaysthought what associated branch? There is no open PR from you other than the multi-byte one.
https://github.com/sidewaysthought/Auto-GPT/tree/read_file-fix-character-length-%233222
@Pwuts thanks. This weighted-average approach is modeled after this cookbook by OpenAI. I'd assume this is a good starting point for solving the problem, but I can certainly do more research on how it compares with other approaches and enhance this part later. That's why I suggest we merge this PR first and open a new issue to track further improvements.
Conflicts have been resolved! 🎉 A maintainer will review the pull request shortly.
@sidewaysthought we can't process your contribution if you don't create a PR
This was the PR https://github.com/Significant-Gravitas/Auto-GPT/pull/3262
This PR exceeds the recommended size of 200 lines. Please make sure you are NOT addressing multiple issues with one PR. Note this PR might be rejected due to its size