qlora icon indicating copy to clipboard operation
qlora copied to clipboard

How do you process oasst1 to get 9209 examples

Open iMountTai opened this issue 1 year ago • 1 comments

Great work! In your paper you say "In our experiments, we only use the top reply at each level in the conversation tree. This limits the dataset to 9,209 examples. "Could you please tell me how to handle the data? Because I got 10364 examples from 2023-04-12_oasst_ready.trees.jsonl, but I don't know where 9209 came from?

iMountTai avatar May 25 '23 12:05 iMountTai

https://huggingface.co/datasets/timdettmers/openassistant-guanaco

This should be the one they are using as oasst1. Looks like there are 9,846 samples.

henryzhongsc avatar Jun 10 '23 08:06 henryzhongsc