Open-Assistant OA dataset consistent splits

We need to make sure that different splits of the dataset are used for sft, reward and rl training.

Basically, sft_dataset and the reward datasets need to use the same splits.

Feb 17 '23 12:02 sanagno

Set the same global seed, maybe?

Feb 19 '23 09:02 theblackcat102

I have a random generator with a specific seed that is used for the dataset only: https://github.com/LAION-AI/Open-Assistant/blob/main/model/model_training/custom_datasets/prompt_dialogue.py#L31
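For illustration, here is a minimal sketch of what a dataset-local generator could look like. This is not the code at the link above: `shuffled_indices` and the seed value are made up for the example, and it uses only the standard library.

```python
import random


def shuffled_indices(n_items, seed=42):
    """Return a reproducible shuffle of range(n_items).

    Hypothetical helper: the name and default seed are assumptions.
    """
    # A private generator: it neither reads nor modifies the global
    # `random` state, so other seeding in the training script is irrelevant.
    rng = random.Random(seed)
    indices = list(range(n_items))
    rng.shuffle(indices)
    return indices


random.seed(0)
first = shuffled_indices(10)
random.seed(12345)  # changing the global seed ...
second = shuffled_indices(10)  # ... leaves the dataset order unchanged
assert first == second
```

The point of the private generator is that the split stays the same no matter what seed the rest of the training code sets globally.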

Feb 19 '23 10:02 sanagno

Hi! I am an industry research scientist and I am interested in contributing to these problems. Am I correct in my understanding that we want the dataset splits 'sft' and 'reward_model' to be identical, while 'rl' is a different split? If so, I think I can solve this problem by creating a fixed order of indices using the random state and then using the same slice of indices for 'sft' and 'reward_model' but a different slice for 'rl'. With this update making 'sft' and 'reward_model' the same split, the split sizes will need to be changed too so that all the data is used.
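A rough sketch of the idea, assuming a plain index-based split. The function name `make_splits`, the `rl_fraction` parameter, and the default values are placeholders for illustration, not anything that exists in the repo.

```python
import random


def make_splits(n_items, rl_fraction=0.2, seed=42):
    """Build index lists for 'sft', 'reward_model' and 'rl'.

    Hypothetical helper: 'sft' and 'reward_model' share one slice of a
    fixed shuffled order, 'rl' gets the remaining, disjoint slice.
    """
    if not 0.0 < rl_fraction < 1.0:
        raise ValueError("rl_fraction must be strictly between 0 and 1")

    # Fixed order of indices from a dedicated, seeded generator.
    rng = random.Random(seed)
    indices = list(range(n_items))
    rng.shuffle(indices)

    # First slice goes to 'rl'; everything else is shared, so all data is used.
    n_rl = int(n_items * rl_fraction)
    rl = indices[:n_rl]
    shared = indices[n_rl:]

    return {"sft": list(shared), "reward_model": list(shared), "rl": rl}


splits = make_splits(100)
assert splits["sft"] == splits["reward_model"]
assert set(splits["sft"]).isdisjoint(splits["rl"])
```

Because both training stages call the same function with the same seed, they get identical slices without having to share any global state.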

Feb 20 '23 17:02 bethanyconnolly

hey @bethanyconnolly yes feel free to do this. Btw @theblackcat102 if you are doing any special preprocessing please push the script, I am missing the history tags.

Feb 20 '23 20:02 sanagno

I think this was not meant to be closed by #1776 so I am reopening

Feb 22 '23 09:02 olliestanley

@sanagno by history tags, are you referring to the reward model or the RLHF training?

Feb 28 '23 00:02 theblackcat102

I basically mean this https://github.com/LAION-AI/Open-Assistant/blob/main/model/reward/instructor/rank_datasets.py#L323.

Feb 28 '23 08:02 sanagno

@sanagno sorry, that's my fault. It's a quick hack to include the historical conversation in the question section (first sentence). Should we use the question and answer tags instead?

Mar 01 '23 08:03 theblackcat102

No no, it's fine. We can just add it to the dataset init, or just to the README, so the instructions to recreate it are easy to follow.

Mar 01 '23 18:03 sanagno

Hi, I am feeling a bit confused about how to handle this issue now. Perhaps it can be unassigned and given to someone else with better context on the problem? Thanks

Mar 10 '23 16:03 bethanyconnolly
