FlagEmbedding
Dataset used in LongLLM_QLORA
Hi,
I wonder what base/source datasets were used to create the following datasets:
- bio_book
- one_details_book
- multi_details_book
- multi_details_paper_long
- one_detail_paper_long
Thanks!
Best,
Mingyi
Hi,
- All books are from the Books3 subset of the Pile.
- All papers are from the ArXiv subset of the Pile.