James Brown

Results: 154 comments by James Brown

What is your 11b reward model? Seems rare to find one. #146

Could I ask what reward models you are using? Seems rare to find one.
> We need the ability to use massive reward models

> FLAN T5 11B

I've reviewed [your code](https://github.com/LouisCastricato/limited-data-scaling-laws/blob/79fd6fcb452d9b513d9520c144a447d5a11ccecd/critic_models.py#L55) (or perhaps you have modified it somehow). I think the prompt format needs to change to support multiline answers. ```python # give prompt and...
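To illustrate what I mean, here is a minimal sketch of a critic prompt that tolerates multiline answers by delimiting the answer block explicitly. The template and the `build_critic_prompt` name are my own assumptions for the example, not the format used in `critic_models.py`.

```python
# Hypothetical illustration only: one way to delimit multiline answers in a
# critic prompt. The template and function name are assumptions, not taken
# from critic_models.py.

PROMPT_TEMPLATE = (
    "Question:\n{question}\n\n"
    "Answer (may span multiple lines, terminated by <END>):\n"
    "{answer}\n<END>\n\n"
    "Is the answer above correct? Reply yes or no."
)

def build_critic_prompt(question: str, answer: str) -> str:
    # Strip trailing newlines so the <END> marker sits directly under the answer.
    return PROMPT_TEMPLATE.format(question=question.strip(), answer=answer.rstrip("\n"))

if __name__ == "__main__":
    print(build_critic_prompt("What is 2 + 2?", "First add the units.\nThe result is 4."))
```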

In addition to the parameter that allows passing a prompt-dependent reward function, I'm more interested in how the reward function is implemented. How is it done, to efficiently train a reward...
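For concreteness, here is a rough sketch of the kind of prompt-dependent reward callback I have in mind. The `reward_fn` signature and the trainer usage at the bottom are assumptions for illustration, not any specific library's API.

```python
# Hypothetical sketch of a prompt-dependent reward function supplied as a
# callback. The signature and the trainer call below are assumed, not a
# specific library's API.
from typing import List

def reward_fn(samples: List[str], prompts: List[str], outputs: List[str]) -> List[float]:
    rewards = []
    for prompt, output in zip(prompts, outputs):
        words = len(output.split())
        if "explain" in prompt.lower():
            # Reward longer answers for prompts that ask for an explanation.
            rewards.append(min(words / 100.0, 1.0))
        else:
            # Otherwise reward concise answers.
            rewards.append(max(1.0 - words / 100.0, 0.0))
    return rewards

# Assumed usage with a trainer that accepts such a callback:
# trainer = SomeRLHFTrainer(model="gpt2", reward_fn=reward_fn)
# trainer.train(prompts=["Explain photosynthesis.", "Name a color."])
```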

To handle the wide range of tasks that ChatGPT is capable of, the reward function is either multimodal (merging multiple metrics and models, fine-tuned according to human preference) or...
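A toy sketch of what "merging multiple metrics and models" could look like: a weighted combination of reward components, where the weights could later be tuned against human preference data. The component functions here are placeholders, not real models.

```python
# Hypothetical sketch of a composite ("merged") reward: several metrics or
# models are combined with weights that could be tuned on human preferences.
# The component functions below are toy placeholders.
from typing import Callable, Dict

RewardComponent = Callable[[str, str], float]

def make_composite_reward(components: Dict[str, RewardComponent],
                          weights: Dict[str, float]) -> RewardComponent:
    def reward(prompt: str, response: str) -> float:
        return sum(weights[name] * fn(prompt, response) for name, fn in components.items())
    return reward

# In practice the components might be a toxicity classifier, a factuality
# checker, and a preference model fine-tuned on human comparisons.
components = {
    "brevity": lambda prompt, response: 1.0 / (1.0 + len(response.split())),
    "on_topic": lambda prompt, response: float(
        any(word in response.lower() for word in prompt.lower().split())
    ),
}
weights = {"brevity": 0.3, "on_topic": 0.7}

composite = make_composite_reward(components, weights)
print(composite("Describe the moon.", "The moon orbits the Earth."))
```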

For such an assistant to become usable, it should be both real-time and always-learning. Real-time means it will listen to the user, monitor feedback from the computer, and react accordingly. Always-learning means...
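A minimal sketch of that loop, under my own assumptions: react to each event as it arrives (real-time), and buffer every interaction so the policy can be updated later (always-learning). All names here are placeholders, not part of any existing project.

```python
# Hypothetical sketch of a real-time / always-learning loop.
# policy() and update_policy() are placeholders for real components.
experience_log = []  # interactions kept for later fine-tuning / policy updates

def policy(event: str) -> str:
    # Placeholder policy; a real assistant would consult a model here.
    return f"ack: {event}"

def update_policy(log):
    # Placeholder for the always-learning step, e.g. periodic fine-tuning
    # on the accumulated experience.
    print(f"(would update policy on {len(log)} interactions)")

def main_loop():
    while True:
        event = input("user> ")  # real-time: block until user/computer feedback
        if event == "quit":
            break
        action = policy(event)
        print(action)
        experience_log.append((event, action))
        if len(experience_log) % 10 == 0:
            update_policy(experience_log)  # always-learning: periodic updates

if __name__ == "__main__":
    main_loop()
```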

Now it has been partly implemented, and as part of my ideology, the project [Cybergod](https://github.com/james4ever0/agi_computer_control) has been released. Here's the program in action: https://github.com/Significant-Gravitas/Auto-GPT/assets/103997068/8e1cd6fe-c49d-4d2b-835d-0ffc9a5a458e If anyone is interested in Cybergod, please...

The cause is that the author wrote the configuration file incorrectly, so the action is not executed correctly.

[Attempted fix](https://github.com/fate0/proxylist/pull/4)

This repo has a "[python3](https://github.com/citronneur/rdpy/tree/python3)" branch; check that out. I know nothing about this library, so my name-only replacements do not work as expected, as you have observed from my fork [rdpy3](https://github.com/James4Ever0/rdpy3)....