hub-docs

[content] copy part of the audio and images datasets doc

Open severo opened this issue 1 year ago • 3 comments

In the "Datasets" section of the Hub docs, we explain how to create different types of datasets. But for audio and image datasets (https://huggingface.co/docs/hub/datasets-data-files-configuration#image-and-audio-datasets), we just point readers to the datasets library docs.

The issue is that the https://huggingface.co/docs/datasets/audio_dataset and https://huggingface.co/docs/datasets/image_dataset pages are very long and very specific to the datasets library.

I think that we should extract the relevant "no-code" part to paste into the Hub docs, and refer to the datasets pages only for "advanced setup".

severo avatar Nov 16 '23 16:11 severo

Makes sense!

lhoestq avatar Nov 16 '23 20:11 lhoestq

I'll work on it

severo avatar Nov 17 '23 08:11 severo

Also:

  • [x] create an example datasets collection, such as https://huggingface.co/collections/datasets-examples/image-dataset-6568e7cf28639db76eb92d65
  • [x] mention the example datasets collections in the docs
  • [ ] do the same for the other formats (text: one column called 'text', where every line is a row; for the other formats, see https://huggingface.co/docs/hub/datasets-adding#file-formats)
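
To make the text convention in the last item concrete, here is a minimal sketch in plain Python of how a text file would map to rows with a single 'text' column. It is only an illustration, not the datasets library's loader: the helper name `text_to_rows` is hypothetical, and the handling of blank lines (kept as empty rows here) is an assumption.

```python
def text_to_rows(content: str) -> list[dict[str, str]]:
    """Map raw text to rows: one row per line, in a single 'text' column.

    Hypothetical helper for illustration only. Blank lines are kept as
    empty rows, which is an assumption rather than documented behavior.
    """
    return [{"text": line} for line in content.splitlines()]


# Made-up sample content standing in for a .txt file in a dataset repo.
sample = "first example\nsecond example\nthird example\n"
rows = text_to_rows(sample)

for row in rows:
    print(row)
```

Each line of the file becomes one row, so a three-line file gives a three-row dataset with one column.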

severo avatar Jul 23 '24 12:07 severo