AutoGPT
Fix the maximum context length issue by chunking
Background
Multiple issues have been opened about the same problem, e.g. #2801, #2871, #2906 and more: several commands call memory.add(), which in turn calls create_embedding_with_ada. When the input text exceeds the model's 8191-token limit, we get an InvalidRequestError saying "This model's maximum context length is 8191 tokens...".
Resolves #2801, resolves #2871, resolves #2906, resolves #3244
Changes
The issue is fixed by chunking the input text, running the embedding on each chunk individually, and then combining the results by weighted averaging. This approach is suggested by OpenAI and modeled after the OpenAI Cookbook. This PR should fix a number of open issues, including the ones mentioned above and more.
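For reference, here is a minimal sketch of the chunk-and-average strategy, modeled on the OpenAI Cookbook recipe for embedding texts longer than the context window. The names and structure below are illustrative only and do not necessarily match this PR's code (see autogpt/llm/llm_utils.py for the actual implementation):

```python
# Illustrative sketch of chunk-and-average embedding (pre-1.0 openai SDK).
import numpy as np
import openai
import tiktoken

EMBEDDING_MODEL = "text-embedding-ada-002"
EMBEDDING_CTX_LENGTH = 8191

def chunked_tokens(text: str, chunk_length: int = EMBEDDING_CTX_LENGTH):
    """Split the token sequence into chunks that fit the model's context."""
    encoding = tiktoken.encoding_for_model(EMBEDDING_MODEL)
    tokens = encoding.encode(text)
    for i in range(0, len(tokens), chunk_length):
        yield tokens[i : i + chunk_length]

def len_safe_get_embedding(text: str) -> list[float]:
    """Embed each chunk separately, then combine by length-weighted average."""
    chunk_embeddings, chunk_lens = [], []
    for chunk in chunked_tokens(text):
        response = openai.Embedding.create(input=chunk, model=EMBEDDING_MODEL)
        chunk_embeddings.append(response["data"][0]["embedding"])
        chunk_lens.append(len(chunk))

    # Weight each chunk's embedding by its token count, then re-normalize
    # so the combined vector has unit length again.
    average = np.average(chunk_embeddings, axis=0, weights=chunk_lens)
    average = average / np.linalg.norm(average)
    return average.tolist()
```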
PR Quality Checklist
- [x] My pull request is atomic and focuses on a single change.
- [x] I have thoroughly tested my changes with multiple different prompts.
- [x] I have considered potential risks and mitigations for my changes.
- [x] I have documented my changes clearly and comprehensively.
- [x] I have not snuck in any "extra" small tweaks or changes
Codecov Report
Patch coverage: 86.48% and project coverage change: +0.24 :tada:
Comparison is base (0ef6f06) 60.31% compared to head (572cac9) 60.55%.
Additional details and impacted files
@@ Coverage Diff @@
## master #3222 +/- ##
==========================================
+ Coverage 60.31% 60.55% +0.24%
==========================================
Files 69 69
Lines 3152 3184 +32
Branches 525 528 +3
==========================================
+ Hits 1901 1928 +27
- Misses 1118 1122 +4
- Partials 133 134 +1
| Impacted Files | Coverage Δ | |
|---|---|---|
| autogpt/llm/__init__.py | 100.00% <ø> (ø) | |
| autogpt/llm/modelsinfo.py | 100.00% <ø> (ø) | |
| autogpt/config/config.py | 76.25% <66.66%> (-0.58%) | :arrow_down: |
| autogpt/llm/llm_utils.py | 66.66% <92.85%> (+5.34%) | :arrow_up: |
@Pwuts I think this change can fix and close multiple open issues. Could you please review, approve and merge?
Please link issues if this PR resolves them
Also, this is missing test coverage. Can you fix that (using pytest, not unittest)?
This pull request has conflicts with the base branch, please resolve those so we can evaluate the pull request.
Endless crashes over the last 4 days, though they have been happening less often over the past 2-3 weeks. It crashed 4 times in a row and then constantly for 3 hours on every restart. Here is some code to cap the max input length for GPT-3.5-turbo: since the max is about 8191 tokens, staying under 24000 characters seems to be safe most of the time. Here is the code:
https://github.com/Significant-Gravitas/Auto-GPT/discussions/3239#discussion-5130661
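(The actual code is in the linked discussion; roughly, the idea is a simple character-based cap, something like the hypothetical sketch below. The name and the 24000-character threshold here are assumptions, not the linked code.)

```python
# Hypothetical sketch only -- the real code lives in the linked discussion.
# ~24000 characters is a rough heuristic for staying under 8191 tokens
# (ada-style tokenizers average roughly 3-4 characters per token).
MAX_CHARS = 24000

def cap_text(text: str, max_chars: int = MAX_CHARS) -> str:
    """Truncate text so the embedding request stays under the token limit."""
    return text if len(text) <= max_chars else text[:max_chars]
```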
I've written code that uses NumPy to count characters and spaCy to count tokens: it uses the character counts to split the text into blocks that come in under the token limit for an embedding, submits each embedding, adds to the total cost, etc. When all embeddings are done, it averages them and returns the combined embedding.
I DO NOT know if this is a great approach, because I am not sure whether I should be averaging, or concatenating into a much beefier vector.
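To make the two options concrete, here is a toy comparison of averaging versus concatenating per-chunk embeddings (hypothetical helper names, not AutoGPT code):

```python
# Toy illustration of the two combining strategies discussed above,
# assuming we already have one embedding vector per chunk.
import numpy as np

def combine_by_average(chunk_embeddings: list[list[float]],
                       chunk_lens: list[int]) -> np.ndarray:
    """Length-weighted average: keeps the original dimensionality
    (1536 for text-embedding-ada-002)."""
    avg = np.average(chunk_embeddings, axis=0, weights=chunk_lens)
    return avg / np.linalg.norm(avg)

def combine_by_concatenation(chunk_embeddings: list[list[float]]) -> np.ndarray:
    """Concatenation: a 'beefier' vector, but its size grows with the number
    of chunks, which fixed-dimension vector stores cannot index directly."""
    return np.concatenate(chunk_embeddings)
```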
Conflicts have been resolved! 🎉 A maintainer will review the pull request shortly.
Linked the issues that this PR is going to fix and added a unit test for the new chunk token func
This pull request has conflicts with the base branch, please resolve those so we can evaluate the pull request.
Conflicts have been resolved! 🎉 A maintainer will review the pull request shortly.
Merging vectors it is!
@Pwuts thanks very much for the review. Though a weighted average is not the most sophisticated way to combine the chunks, it is still effective in many cases and better than simply truncating. The tradeoff of not doing any averaging now is that Auto-GPT crashes and loses all its work, which is something we don't want either. I suggest a two-step approach: first, use the weighted-average technique to avoid crashes and keep making progress; then, implement a more advanced partition-averaging technique, which requires clustering or learning methods, e.g. the "Sparse Dictionary learning" used by the quoted paper to group similar sentences together. To stay organized, we can merge this PR, close the related issues, and open a new one to track progress. This approach will balance our concerns and allow us to make progress effectively.
Can you show that this performs? Based on my limited knowledge I'm not convinced that taking a weighted average will be effective for large amounts of content.
I pushed code last night into the associated branch that concatenates rather than averages.
This pull request has conflicts with the base branch, please resolve those so we can evaluate the pull request.
@sidewaysthought what associated branch? There is no open PR from you other than the multi-byte one.
https://github.com/sidewaysthought/Auto-GPT/tree/read_file-fix-character-length-%233222
@Pwuts thanks. This weighted-average approach is modeled after this cookbook by OpenAI. I'd assume this is a good starting point for solving the problem, but I can certainly do more research on how it compares with other approaches and enhance this part later. That's why I suggest we merge this PR first and open a new issue to track further improvements.
Conflicts have been resolved! 🎉 A maintainer will review the pull request shortly.
@sidewaysthought we can't process your contribution if you don't create a PR
This was the PR https://github.com/Significant-Gravitas/Auto-GPT/pull/3262
This PR exceeds the recommended size of 200 lines. Please make sure you are NOT addressing multiple issues with one PR. Note this PR might be rejected due to its size