
add guidance on adding files and a dataset script

Open • davanstrien opened this issue 2 years ago • 1 comment

There are cases where you may want to upload files to the Hub but keep more control over how the dataset is loaded by using a dataset loading script. This can be relevant:

  • when the original host may prefer not to have the additional traffic from downloads
  • when the original data is in a format that is not amenable to streaming/computational access, or is very slow to process/load
  • when the dataset contains complex configuration options
  • ??

davanstrien • Jul 11 '22 14:07

  • When the contributor decides to remove a specific instance from the data
  • When some instances in the dataset have a different license than others, e.g. one instance may be in the public domain and allowed for use in commercial projects, while another is not

Skorkmaz88 • Jul 14 '22 16:07
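
As a rough illustration of the last two cases (removed instances and mixed licenses), here is a minimal sketch of the kind of filtering a loading script could apply while yielding examples. It uses only the standard library. The field names (`id`, `license`, `text`), the license labels, and the `commercial_only` option are all hypothetical, not part of any existing dataset or of the `datasets` API. In a real loading script, similar logic would probably live in the builder's `_generate_examples` method, which yields `(key, example)` pairs.

```python
from typing import Dict, Iterable, Iterator, Set, Tuple

# Hypothetical: IDs the contributor has asked to withdraw from the dataset.
REMOVED_IDS: Set[str] = {"doc-002"}

# Hypothetical: license labels treated as safe for commercial use.
COMMERCIAL_OK: Set[str] = {"public-domain", "cc-by-4.0"}


def generate_examples(
    records: Iterable[Dict[str, str]],
    commercial_only: bool = False,
) -> Iterator[Tuple[str, Dict[str, str]]]:
    """Yield (key, example) pairs, skipping removed or disallowed records."""
    for record in records:
        # Drop instances that were withdrawn after the files were uploaded.
        if record["id"] in REMOVED_IDS:
            continue
        # Optionally keep only instances whose license allows commercial use.
        if commercial_only and record["license"] not in COMMERCIAL_OK:
            continue
        yield record["id"], record


# Toy records standing in for the uploaded data files (made-up values).
RECORDS = [
    {"id": "doc-001", "license": "public-domain", "text": "first"},
    {"id": "doc-002", "license": "public-domain", "text": "second"},
    {"id": "doc-003", "license": "cc-by-nc-4.0", "text": "third"},
]

all_ids = [key for key, _ in generate_examples(RECORDS)]
commercial_ids = [
    key for key, _ in generate_examples(RECORDS, commercial_only=True)
]
print(all_ids)
print(commercial_ids)
```

Keeping the removal list and license rules in the script means the uploaded files can stay untouched while the loaded dataset changes. A flag like `commercial_only` could also be exposed as one of the dataset's configuration options.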