AutoGPT
Reward system for AutoGPT
Duplicates
- [X] I have searched the existing issues
Summary 💡
Introduce a reward system so that AutoGPT can also learn. All actions should be visible in a history, possibly with their parameters available for inspection, and each action could then be rated as positive ("p") or negative ("n"). This evaluation file could then be shared as an extension among different users or, eventually, even fed back to OpenAI for learning.
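To make the idea concrete, here is a minimal sketch of such a history in Python. Everything in it is an assumption for illustration: the names `ActionRecord` and `ActionHistory`, the JSON export format, and the example command names are hypothetical and not part of AutoGPT's actual code.

```python
import json
from dataclasses import dataclass, asdict
from typing import Any, Dict, List, Optional


@dataclass
class ActionRecord:
    """One executed action plus an optional user rating (hypothetical schema)."""
    command: str
    params: Dict[str, Any]
    rating: Optional[str] = None  # "p" (positive), "n" (negative), or None if unrated


class ActionHistory:
    """Keeps every action in order so it can be reviewed and rated later."""

    def __init__(self) -> None:
        self.records: List[ActionRecord] = []

    def log(self, command: str, params: Dict[str, Any]) -> int:
        """Store an action and return its index in the history."""
        self.records.append(ActionRecord(command, dict(params)))
        return len(self.records) - 1

    def rate(self, index: int, rating: str) -> None:
        """Attach a "p" or "n" rating to a previously logged action."""
        if rating not in ("p", "n"):
            raise ValueError('rating must be "p" or "n"')
        self.records[index].rating = rating

    def export(self) -> str:
        """Serialize only the rated actions as JSON, e.g. for sharing between users."""
        rated = [asdict(r) for r in self.records if r.rating is not None]
        return json.dumps(rated, indent=2)


# Example usage with made-up command names.
history = ActionHistory()
first = history.log("browse_website", {"url": "https://example.com"})
history.log("write_file", {"filename": "notes.txt"})
history.rate(first, "p")
print(history.export())
```

Exporting only the rated records keeps the shared file small; whether unrated actions should also be included is an open design choice.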
Examples 🌈
No response
Motivation 🔦
To improve the software through crowd learning, which is the basic idea behind ChatGPT.
Please note that GPT-4 can only do in-context learning, as the API does not currently support fine-tuning the model.
This is more important than people may think: you need it for any sort of fitness function or training purpose. Regardless of whether the LLM supports this or not, the reward system could also be executed locally to self-optimize: https://github.com/Significant-Gravitas/Auto-GPT/pull/3868#issuecomment-1539976726
All actions/commands also have a certain cost associated with them, so in general they need to expose their costs so that a local reward function can optimize for them:
- #4042
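As a rough sketch of what a local, cost-aware reward could look like: the `CommandCost` fields, the weights, and the command names below are assumptions made up for illustration, not AutoGPT's actual interface or the design proposed in the linked issue.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class CommandCost:
    """Cost a command exposes for one execution (hypothetical fields)."""
    tokens: int      # LLM tokens consumed
    seconds: float   # wall-clock time spent


def local_reward(
    success: bool,
    cost: CommandCost,
    token_weight: float = 0.001,
    time_weight: float = 0.01,
) -> float:
    """Reward = outcome score minus weighted costs; the weights are arbitrary."""
    outcome = 1.0 if success else -1.0
    return outcome - token_weight * cost.tokens - time_weight * cost.seconds


def best_command(results: Dict[str, Tuple[bool, CommandCost]]) -> str:
    """Return the command name whose (success, cost) pair yields the highest reward."""
    return max(results, key=lambda name: local_reward(*results[name]))


# Example: two successful commands, one much cheaper than the other.
results = {
    "google_search": (True, CommandCost(tokens=200, seconds=2.0)),
    "browse_website": (True, CommandCost(tokens=3000, seconds=15.0)),
}
print(best_command(results))
```

Because the reward is computed entirely from locally observed outcomes and costs, it does not depend on the LLM supporting fine-tuning.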
I'm all for on-policy reinforcement learning; see the paper "LETI: Learning to Generate from Textual Interactions": https://arxiv.org/abs/2305.10314
This issue has automatically been marked as stale because it has not had any activity in the last 50 days. You can unstale it by commenting or removing the label. Otherwise, this issue will be closed in 10 days.
This issue was closed automatically because it has been stale for 10 days with no activity.