evals Dataset hosting, data cards and previews

Hi, I'm Quentin from Hugging Face :)

I know hosting datasets on GitHub is not always practical: Git LFS is required, there is no data preview, storage is limited (maybe not for you haha), and there is no standard for data documentation. So I was wondering:

Have you considered hosting alternatives that are better suited to datasets and would let researchers explore the eval datasets?

This way researchers can know in depth what data is used for evaluation, along with its goals and limitations, in particular to better understand which domains and structures their models perform well or poorly on.

For example, the Hugging Face datasets hub shows a data card for documentation and a preview for each dataset. Loading and caching a dataset is also one line of Python, saving you from wget and GitHub hosting. The Hub also supports pull requests so the community can contribute.
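To make the "one line of Python" concrete, here is a minimal sketch built around `datasets.load_dataset`. The wrapper name `load_eval_dataset`, the `loader` parameter (only there so the call can be exercised offline), and the repo id and subset name in the comment are all hypothetical and just for illustration.

```python
from typing import Any, Callable, Optional


def load_eval_dataset(
    repo_id: str,
    subset: Optional[str] = None,
    split: str = "train",
    loader: Optional[Callable[..., Any]] = None,
) -> Any:
    """Load one split of a dataset hosted on the Hub.

    `repo_id` and `subset` are whatever names the dataset was published
    under; nothing here is specific to evals.
    """
    if loader is None:
        # Imported lazily so this file can be imported without `datasets` installed.
        from datasets import load_dataset

        loader = load_dataset
    # The actual one-liner: repo id, optional subset name, and split.
    # `datasets` downloads the files on first use and reuses its local cache afterwards.
    return loader(repo_id, subset, split=split)


# Hypothetical usage (needs network access and a dataset that really exists):
# samples = load_eval_dataset("my-org/evals", "some-eval")
# print(samples[0])
```

The wrapper adds nothing beyond the library call itself; in practice you would probably just call `load_dataset` directly.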

It can even allow those datasets to be used in other well-known eval frameworks, such as lm-evaluation-harness.

Let me know what you think!

Mar 15 '23 11:03 lhoestq

I love Hugging Face! Worth considering at some point soon; we will explore it internally over the next few weeks.

Mar 20 '23 21:03 logankilpatrick

Thanks for stopping by! HuggingFace datasets is great.

Many of our evals are only a few samples long (10-20), which we worried would be too small to host as individual datasets on Hugging Face Datasets. We needed a platform that supports lots of small datasets, which is why LFS seemed to work OK for our task.

If you think this is still a reasonable use case for HuggingFace Datasets, I'd be happy to help with any effort to mirror them onto HuggingFace!

Mar 30 '23 00:03 andrew-openai

@logankilpatrick @andrew-openai hi, I'm Polina from the HF datasets team :) Regarding your worry about many small datasets: in datasets it's possible to host more than one dataset as a single dataset with many subsets, as is done for benchmarks like GLUE.
Also feel free to ping me if you have any questions about adding datasets to the Hub :)
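
As a rough sketch of the "one dataset, many subsets" idea: one way to lay out such a repository is to give each small eval its own JSONL file and declare each one as a named subset in the `configs` section of the repository's `README.md` metadata. The function name, the `samples.jsonl` file name, and the directory layout below are assumptions chosen for illustration, not something the evals repo or the Hub prescribes.

```python
import json
from pathlib import Path
from typing import Dict, List


def build_dataset_repo(evals: Dict[str, List[dict]], root: Path) -> Path:
    """Write each small eval to its own JSONL file and list it as a subset.

    `evals` maps a subset name to its samples. The returned directory is
    what would be uploaded to a single dataset repository.
    """
    root.mkdir(parents=True, exist_ok=True)
    config_lines = ["---", "configs:"]
    for name, samples in sorted(evals.items()):
        data_file = root / name / "samples.jsonl"  # hypothetical layout
        data_file.parent.mkdir(parents=True, exist_ok=True)
        with data_file.open("w", encoding="utf-8") as f:
            for sample in samples:
                f.write(json.dumps(sample) + "\n")
        # One YAML entry per subset, pointing at that subset's data file.
        config_lines.append(f"- config_name: {name}")
        config_lines.append(f"  data_files: {name}/samples.jsonl")
    config_lines.append("---")
    (root / "README.md").write_text("\n".join(config_lines) + "\n", encoding="utf-8")
    return root
```

With this layout, a 10-20 sample eval is just one more small file and one more entry in the metadata block, and each subset can then be loaded by name as the second argument to `load_dataset`.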

Mar 30 '23 12:03 polinaeterna

Has there been any further consideration of a Hugging Face dataset mirror?

Jun 06 '23 18:06 EwoutH