OA dataset consistent splits

Open sanagno opened this issue 2 years ago • 9 comments

We need to make sure that different splits of the dataset are used for sft, reward and rl training.

Basically, the sft_dataset and reward datasets need to use the same splits.

sanagno avatar Feb 17 '23 12:02 sanagno

Maybe set the same global seed?

theblackcat102 avatar Feb 19 '23 09:02 theblackcat102

I have a random generator with a specific seed for the dataset only: https://github.com/LAION-AI/Open-Assistant/blob/main/model/model_training/custom_datasets/prompt_dialogue.py#L31

sanagno avatar Feb 19 '23 10:02 sanagno

Hi! I am an industry research scientist and I am interested in contributing to these problems. Am I correct in my understanding that we want the dataset splits 'sft' and 'reward_model' to be identical while 'rl' is a different split? If so, I think I can solve this problem by creating a fixed order of indices using the random state and then using the same slice of indices for 'sft' and 'reward_model' but a different slice for 'rl'. With this update making 'sft' and 'reward_model' the same split, the split sizes will need to be changed too so that all the data is used.
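
A minimal sketch of the idea described above, not the project's actual implementation. The function name `make_split_indices`, the `shared_fraction` parameter, and the default seed are all hypothetical, and it assumes the dataset can be addressed by integer index. A dedicated seeded generator fixes the index order, 'sft' and 'reward_model' take the same slice, and 'rl' takes the remainder so all the data is used:

```python
import random


def make_split_indices(n_examples, shared_fraction=0.8, seed=42):
    """Return index lists per training stage (hypothetical helper).

    'sft' and 'reward_model' receive the same indices; 'rl' receives
    the remaining ones, so the two groups never overlap.
    """
    if not 0.0 < shared_fraction < 1.0:
        raise ValueError("shared_fraction must be strictly between 0 and 1")

    # A dedicated generator keeps the order independent of any global seed.
    order = list(range(n_examples))
    random.Random(seed).shuffle(order)

    cut = int(n_examples * shared_fraction)
    shared = order[:cut]
    return {
        "sft": shared,
        "reward_model": list(shared),  # same indices, separate list object
        "rl": order[cut:],
    }


splits = make_split_indices(10)
```

Because the order depends only on `seed` and `n_examples`, every training script that calls the helper with the same arguments gets the same splits, whatever its own global seed is.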

bethanyconnolly avatar Feb 20 '23 17:02 bethanyconnolly

Hey @bethanyconnolly, yes, feel free to do this. By the way, @theblackcat102, if you are doing any special preprocessing, please push the script; I am missing the history tags.

sanagno avatar Feb 20 '23 20:02 sanagno

I think this was not meant to be closed by #1776, so I am reopening.

olliestanley avatar Feb 22 '23 09:02 olliestanley

@sanagno by history tag, are you referring to the reward model or the RLHF training?

theblackcat102 avatar Feb 28 '23 00:02 theblackcat102

I basically mean this https://github.com/LAION-AI/Open-Assistant/blob/main/model/reward/instructor/rank_datasets.py#L323.

sanagno avatar Feb 28 '23 08:02 sanagno

@sanagno sorry, that’s my fault. It’s a quick hack to include the conversation history in the question section (first sentence). Should we use the question, answer tags instead?

theblackcat102 avatar Mar 01 '23 08:03 theblackcat102

No, no, it's fine. We can just add it to the dataset init or to the README so that the instructions to recreate it are easy to follow.

sanagno avatar Mar 01 '23 18:03 sanagno

Hi, I am feeling a bit confused about how to handle this issue now. Perhaps it can be unassigned and given to someone else with better context on the problem? Thanks

bethanyconnolly avatar Mar 10 '23 16:03 bethanyconnolly