Open-Assistant
Add RankGen Classification
Add a demo for RankGen classification, as proposed in: https://github.com/LAION-AI/Open-Assistant/issues/382#issue-1519347873
Doesn't use https://github.com/anthropics/hh-rlhf just yet.
This looks great!
@SummerSigh interesting take! I know it's just a proof of concept, but could you change the notebook directory to harmful-filtering / negative-question-filtering?
Here's another suggested change:
My concern is how accurate or precise this filtering method is. Can you do a simple train-test split and evaluate it? (In this scenario you would probably use only the train split to generate the reference embedding.)
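A minimal sketch of the evaluation suggested above, in case it helps. Everything here is an assumption for illustration: `embed` is a hypothetical bag-of-words stand-in for the RankGen encoder used in the notebook, the labeled rows are made-up toy data, and the 0.3 threshold is arbitrary. Only the train split is used to build the reference embedding; precision and recall are then measured on the held-out test split.

```python
import math
import random
from collections import Counter


def embed(text):
    """Hypothetical stand-in encoder: bag-of-words counts (not RankGen)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def train_test_split(rows, test_fraction=0.5, seed=0):
    """Shuffle a copy of the rows and cut it into train and test parts."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    cut = int(len(rows) * (1 - test_fraction))
    return rows[:cut], rows[cut:]


def build_reference(train_rows):
    """Sum the embeddings of harmful *train* examples only."""
    reference = Counter()
    for text, is_harmful in train_rows:
        if is_harmful:
            reference.update(embed(text))
    return reference


def evaluate(reference, test_rows, threshold=0.3):
    """Flag test rows whose similarity to the reference meets the threshold."""
    tp = fp = fn = 0
    for text, is_harmful in test_rows:
        flagged = cosine(embed(text), reference) >= threshold
        if flagged and is_harmful:
            tp += 1
        elif flagged and not is_harmful:
            fp += 1
        elif not flagged and is_harmful:
            fn += 1
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


if __name__ == "__main__":
    # Toy labeled rows (text, is_harmful); purely illustrative.
    data = [
        ("how do I steal a car", True),
        ("how do I steal a bike", True),
        ("what is a good pasta recipe", False),
        ("recommend a good book", False),
    ]
    train, test = train_test_split(data)
    reference = build_reference(train)
    print(evaluate(reference, test))
```

With a real encoder, the same split and metric code would apply; only `embed` and the similarity threshold would need to change.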
Sure! I'll add a test set to the dataset! My one concern about changing the name of the directory is that there already seems to be another attempt at data cleaning at detoxify-evaluation. If we do establish one method or the other as the main one we are going to use, we can then change the name of the folder. Also, in the next version of this POC, I will use the hh-RLHF dataset, which I can also evaluate on.
thank you very much! could you run pre-commit run --all-files to make the linters happy?
@yk It seems pre-commit (on GitHub Actions) continues to fail without any discernible error. I think this is because it's a notebook and something weird is going on. I can't run it locally due to computer issues.
black-jupyter (part of pre-commit) also formats notebooks. I've run it for you, can you apply the attached patch? pr385-pre-commit.patch
:x: pre-commit failed.
Please run pre-commit run --all-files locally and commit the changes.
Find more information in the repository's CONTRIBUTING.md
Closing this pull request due to large updates in the safety pipeline.