Open-Assistant Changed the regex in utils.py...

Changed the regex in utils.py - the original missed references with 2+ ids and references with spaces after the commas. Also changed the way the WebGPT dataset is loaded to use dataset.map() instead of a loop, which should be about 3x faster.
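For anyone skimming, here is a rough sketch of the kind of pattern this is about. It is not the exact code from utils.py: the pattern, the function names and the "answer" field are all assumptions made up for illustration. It assumes references look like [1], [1,2] or [1, 2, 3], and shows a per-example function of the shape that a map() call would take.

```python
import re

# Hypothetical pattern (an assumption, not the one in utils.py):
# one or more integer ids in square brackets, separated by commas
# with optional whitespace, e.g. [1], [1,2], [1, 2, 3].
REFERENCE_RE = re.compile(r"\[\d+(?:\s*,\s*\d+)*\]")


def strip_references(text: str) -> str:
    """Remove bracketed reference markers and tidy the leftover spacing."""
    cleaned = REFERENCE_RE.sub("", text)
    # Drop whitespace that is now stranded in front of punctuation.
    cleaned = re.sub(r"\s+([.,;:!?])", r"\1", cleaned)
    # Collapse any runs of spaces left where a marker used to be.
    return re.sub(r" {2,}", " ", cleaned).strip()


def clean_example(example: dict) -> dict:
    """Per-example transform; "answer" is a hypothetical field name."""
    return {"answer": strip_references(example["answer"])}
```

With Hugging Face datasets, a function shaped like clean_example can be passed to dataset.map() so the library applies it row by row instead of a hand-written loop. The speedup mentioned above is the one observed in this PR, not something this sketch measures.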

Jan 01 '23 23:01 agoryuno

@theblackcat102 could you have a look?

Jan 02 '23 10:01 yk

@yk looks good, it fixes references with spaces in between. Great fix! @agoryuno

Jan 03 '23 12:01 theblackcat102

@agoryuno can you remove these unused imports? pre-commit is not happy about them

Jan 03 '23 12:01 theblackcat102

Thank you! @agoryuno could you run pre-commit run --all-files and then commit & push, to make the linters happy?

Jan 03 '23 14:01 yk

Strange. I struggled with pre-commit for an hour last night and was sure I'd won ) I'll do it tomorrow. I've refactored it some more, making it run in Colab as well as from the CLI. It was a fairly large refactor so I put it in a separate repo for now. It'd be great if @theblackcat102 could take a look at it: https://github.com/agoryuno/instructor

Jan 04 '23 00:01 agoryuno

General feedback about PRs: It's best to create a feature branch for your changes in your own repo. That way we can be sure that the pull request doesn't contain unrelated changes that you've merged into your own default branch.

Jan 05 '23 04:01 bitplane

@agoryuno Could you please resolve the conflict in model/reward/instructor/trainer.py?

Jan 05 '23 08:01 andreaskoepf

Forgot to check if this was merged already before committing changes. Crap!

Jan 05 '23 14:01 agoryuno

I'll reopen a new one with all the changes at once, since I was going to anyway.

Jan 05 '23 14:01 agoryuno

@agoryuno can you push the PR for your current changes first? We need to get a working reward-model running first, and the code for training is changing rapidly, so the later the PR, the more conflicts there will be.

Jan 06 '23 13:01 theblackcat102