qlora
qlora copied to clipboard
How do you process oasst1 to get 9209 examples
Great work! In your paper you say "In our experiments, we only use the top reply at each level in the conversation tree. This limits the dataset to 9,209 examples. "Could you please tell me how to handle the data? Because I got 10364 examples from 2023-04-12_oasst_ready.trees.jsonl, but I don't know where 9209 came from?
https://huggingface.co/datasets/timdettmers/openassistant-guanaco
This should be the one they are using as oasst1. Looks like there are 9,846 samples.