Vedaad Shakib
Hi, while downloading and processing Dolma v1.7, I noticed that many samples in the dataset are duplicates that share the same `id` field. E.g., in the `Project Gutenberg` source, there...
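For anyone who wants to check this on their own copy, here is a minimal sketch of how repeated `id` values could be counted. It assumes each record is one JSON object per line with an `id` field; the function name `find_duplicate_ids` and the sample records are hypothetical and only for illustration, not taken from Dolma or its tooling.

```python
import json
from collections import Counter
from typing import Iterable


def find_duplicate_ids(lines: Iterable[str]) -> dict[str, int]:
    """Return {id: count} for every `id` that appears more than once.

    Assumes each non-blank line is a JSON object with an `id` key.
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        counts[record["id"]] += 1
    return {doc_id: n for doc_id, n in counts.items() if n > 1}


# Hypothetical sample records (not real Dolma data).
sample = [
    '{"id": "a1", "text": "first"}',
    '{"id": "b2", "text": "second"}',
    '{"id": "a1", "text": "first again"}',
]
duplicates = find_duplicate_ids(sample)
print(duplicates)
```

If the shards are gzip-compressed JSON lines, the same function should work on a file handle opened with `gzip.open(path, "rt")`, where `path` is whatever local shard you downloaded. Note that the counter keeps every `id` in memory, so a very large source may need to be checked shard by shard or with an on-disk index instead.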