Reward system for AutoGPT

Open horazius opened this issue 2 years ago • 2 comments

Duplicates

  • [X] I have searched the existing issues

Summary 💡

Introduce a reward system so that AutoGPT can also learn. All actions should be visible in a history. Ideally you could also inspect each action's parameters and then rate the action as positive ("p") or negative ("n"). The resulting evaluation file could be shared among users as an extension, or eventually even sent back to OpenAI for training.
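A minimal sketch of what such a rated history could look like, assuming a simple in-memory log that is exported as JSON. The class names, the fields, and the example commands are hypothetical and not part of AutoGPT; only the "p"/"n" labels come from the proposal above.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class ActionRecord:
    """One executed action plus an optional user rating (hypothetical schema)."""
    command: str
    params: dict
    rating: Optional[str] = None  # "p" (positive), "n" (negative), or unrated


@dataclass
class ActionHistory:
    records: list = field(default_factory=list)

    def log(self, command: str, params: dict) -> int:
        """Append an action and return its index in the history."""
        self.records.append(ActionRecord(command, dict(params)))
        return len(self.records) - 1

    def rate(self, index: int, rating: str) -> None:
        """Attach a "p" or "n" rating to a logged action."""
        if rating not in ("p", "n"):
            raise ValueError('rating must be "p" or "n"')
        self.records[index].rating = rating

    def export(self) -> str:
        """Serialize only the rated actions, e.g. for sharing with other users."""
        rated = [asdict(r) for r in self.records if r.rating is not None]
        return json.dumps(rated, indent=2)


# Illustrative usage with made-up actions.
history = ActionHistory()
first = history.log("browse_website", {"url": "https://example.com"})
second = history.log("write_to_file", {"filename": "notes.txt"})
history.log("list_files", {"directory": "."})  # left unrated, so not exported
history.rate(first, "p")
history.rate(second, "n")
exported = history.export()
print(exported)
```

Exporting only the rated entries keeps the shared file small; whether unrated actions should also be kept is an open design choice.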

Examples 🌈

No response

Motivation 🔦

To improve the software through crowd learning, which is the basic idea behind ChatGPT.

horazius, Apr 18 '23 21:04

Please note that GPT-4 can only do in-context learning, as the API does not currently support fine-tuning the model.
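With in-context learning only, one option would be to feed earlier ratings back into the prompt. A rough sketch, assuming rated actions are available as plain dictionaries; the function name, the dictionary keys, and the prompt wording are all made up for illustration and are not an AutoGPT or OpenAI API:

```python
import json


def build_feedback_prompt(rated_actions, max_examples=5):
    """Turn the most recent rated actions into a text block for the prompt."""
    recent = rated_actions[-max_examples:] if max_examples > 0 else []
    lines = ["Feedback on earlier actions (p = positive, n = negative):"]
    for action in recent:
        # sort_keys keeps the rendered parameters stable between runs
        params = json.dumps(action["params"], sort_keys=True)
        lines.append(f"- {action['command']} {params} -> {action['rating']}")
    return "\n".join(lines)


# Illustrative data; the commands and parameters are hypothetical.
rated = [
    {"command": "browse_website", "params": {"url": "https://example.com"}, "rating": "p"},
    {"command": "write_to_file", "params": {"filename": "notes.txt"}, "rating": "n"},
]
prompt_prefix = build_feedback_prompt(rated)
print(prompt_prefix)
```

The returned text would be prepended to the next request, so the model sees the ratings without any change to its weights. The `max_examples` cap is there because every example consumes context tokens.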

Androbin, Apr 19 '23 00:04

This is more important than people may think: you need it for any sort of fitness function or training purpose. Regardless of whether the LLM supports this, the reward system could also be executed locally to self-optimize: https://github.com/Significant-Gravitas/Auto-GPT/pull/3868#issuecomment-1539976726

Also, every action/command has a certain cost associated with it. So in general, all actions/commands need to expose their costs so that a local reward function can optimize for them:

  • #4042
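As a rough sketch of how a local reward function could combine ratings with exposed costs: the `Step` type, the cost values, the +1/-1 scoring, and the `cost_weight` parameter below are assumptions for illustration only. Neither this thread nor the linked issue specifies a formula or cost units.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    command: str
    cost: float   # cost exposed by the command (hypothetical units)
    rating: str   # "p" or "n", as proposed in the summary


def episode_reward(steps, cost_weight=1.0):
    """Sum of per-step scores: +1 for "p", -1 for "n", minus weighted cost."""
    score = {"p": 1.0, "n": -1.0}
    return sum(score[s.rating] - cost_weight * s.cost for s in steps)


# Two made-up plans; the one with the higher reward is preferred.
plan_a = [Step("google", 0.02, "p"), Step("browse_website", 0.10, "p")]
plan_b = [Step("browse_website", 0.10, "p"), Step("browse_website", 0.10, "n")]

best = max([plan_a, plan_b], key=episode_reward)
print([s.command for s in best])
```

A larger `cost_weight` makes the optimizer favor cheaper commands even when the ratings are equal.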

Boostrix, May 09 '23 06:05

I'm all for on-policy reinforcement learning. See the paper "LETI: Learning to Generate from Textual Interactions": https://arxiv.org/abs/2305.10314

Androbin, May 22 '23 17:05

This issue has automatically been marked as stale because it has not had any activity in the last 50 days. You can unstale it by commenting or removing the label. Otherwise, this issue will be closed in 10 days.

github-actions[bot], Sep 06 '23 21:09

This issue was closed automatically because it has been stale for 10 days with no activity.

github-actions[bot], Sep 17 '23 01:09