Open-Assistant OA dataset consistent splits #1661

Open bethanyconnolly opened this issue 1 year ago • 4 comments

Feb 26 '23 17:02 bethanyconnolly

:x: pre-commit failed. Please run `pre-commit run --all-files` locally and commit the changes. Find more information in the repository's CONTRIBUTING.md.

Feb 26 '23 17:02 github-actions[bot]

Not exactly what we had in mind, can you check again #1661?

Feb 27 '23 08:02 sanagno

> Not exactly what we had in mind, can you check again #1661?

Hi, can you please clarify what you had in mind? In #1661 we discussed that the 'sft' and 'reward' datasets needed to use the same split while 'rl' should use a different one. I think that is what I have done, but maybe I have misinterpreted what you wanted? Thanks!

Feb 27 '23 15:02 bethanyconnolly
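
For readers following the thread, the interpretation described in the comment above (same split for 'sft' and 'reward', a different one for 'rl') can be sketched roughly as follows. This is only an illustration, not the code from the pull request: the helper `split_indices`, the `SPLIT_SEEDS` mapping, the seed values, the dataset size, and the validation fraction are all hypothetical.

```python
import random


def split_indices(n_items, val_fraction, seed):
    """Return (train, val) index lists produced by a seeded shuffle."""
    indices = list(range(n_items))
    # A private Random instance keeps the shuffle independent of global state.
    random.Random(seed).shuffle(indices)
    n_val = int(n_items * val_fraction)
    return sorted(indices[n_val:]), sorted(indices[:n_val])


# Hypothetical seeds: 'sft' and 'reward' share one, 'rl' gets its own.
SPLIT_SEEDS = {"sft": 0, "reward": 0, "rl": 1}

sft_train, sft_val = split_indices(1000, 0.2, SPLIT_SEEDS["sft"])
reward_train, reward_val = split_indices(1000, 0.2, SPLIT_SEEDS["reward"])
rl_train, rl_val = split_indices(1000, 0.2, SPLIT_SEEDS["rl"])
```

Sharing a seed only yields identical splits when both stages index the same underlying data in the same order, which is the assumption this sketch makes.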

The reward training uses a different dataset to begin with, see here. We then need to make sure the splits between these datasets are consistent in case we want to avoid using the same data for the SFT and reward training, which is still up for debate.

Mar 02 '23 10:03 sanagno
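
When the two stages draw from different datasets, a shared seed is not enough, because the same example may sit at different positions in each dataset. One possible way to keep splits consistent in that case is to derive the split from a hash of a stable identifier. The sketch below assumes each row carries such an identifier; the field name `thread_id`, the helper names, and the sample rows are hypothetical and not taken from the repository.

```python
import hashlib


def assign_split(key, val_fraction=0.2):
    """Map a stable key to 'train' or 'val' deterministically."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Turn the first 8 bytes of the digest into a number in [0, 1).
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return "val" if bucket < val_fraction else "train"


def split_rows(rows, key_field="thread_id"):
    """Group rows by the split assigned to their key."""
    splits = {"train": [], "val": []}
    for row in rows:
        splits[assign_split(row[key_field])].append(row)
    return splits


# Hypothetical rows: the two datasets overlap on thread "t2".
sft_rows = [{"thread_id": "t1"}, {"thread_id": "t2"}]
reward_rows = [{"thread_id": "t2"}, {"thread_id": "t3"}]

sft_splits = split_rows(sft_rows)
reward_splits = split_rows(reward_rows)
```

Because the assignment depends only on the key, a thread that appears in both datasets lands in the same split in each, regardless of row order or dataset size.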
