offsite-tuning
Usage of Pile dataset to train the emulator
Hi,
I noticed that you trained the NLP emulator on the first 30 chunks of the Pile dataset. How large are these 30 chunks? In other words, how many chunks does the Pile have in total? The original Pile dataset is over 800 GB, which is too large for our lab's resources...
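For a rough sense of scale, here is a back-of-the-envelope estimate of the per-chunk size, assuming the ~800 GB figure refers to the full dataset and the chunks are evenly sized (both assumptions, not confirmed by the repo):

```python
# Assumption: ~800 GB total, split evenly into chunks.
total_gb = 800   # approximate size of the full Pile dataset
num_chunks = 30  # chunks used to train the emulator

per_chunk_gb = total_gb / num_chunks
print(f"~{per_chunk_gb:.1f} GB per chunk")  # ~26.7 GB per chunk
```

So even 30 chunks could still amount to the whole corpus if "chunk" means a shard of the full dataset, which is why the total chunk count matters here.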
Also, did you try training on smaller datasets, such as WikiText? How does the emulator perform with them?
Thanks