add guidance on adding files and a dataset script
Sometimes you may want to upload files to the Hub while keeping more control over how the dataset is loaded, by using a dataset loading script. This can be relevant in several cases:
- The original host may prefer not to receive the additional traffic from downloads
- The original data is in a format that is not amenable to streaming or computational access, or is very slow to process or load
- The dataset contains complex configuration options
- ??
- The contributor decides to remove a specific instance from the data
- Some instances in the dataset may have a different license than others, e.g. one instance may be in the public domain and usable in commercial projects while another is not
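
As a rough illustration of the last two cases, the sketch below filters instances by ID and by license before yielding them. In a `datasets` loading script, logic like this would typically live in the builder's `_generate_examples` method, which yields `(key, example)` pairs. The `datasets` library is deliberately not imported here so the snippet stays self-contained. The field names (`id`, `text`, `license`), the sample rows, and the `generate_examples` helper are all hypothetical and only serve the example.

```python
from typing import Dict, Iterable, Iterator, Optional, Set, Tuple

# Hypothetical records standing in for the uploaded data files.
# The field names "id", "text", and "license" are assumptions.
ROWS = [
    {"id": "a1", "text": "first example", "license": "public-domain"},
    {"id": "a2", "text": "second example", "license": "non-commercial"},
    {"id": "a3", "text": "third example", "license": "public-domain"},
]


def generate_examples(
    rows: Iterable[Dict[str, str]],
    removed_ids: Optional[Set[str]] = None,
    allowed_licenses: Optional[Set[str]] = None,
) -> Iterator[Tuple[str, Dict[str, str]]]:
    """Yield (key, example) pairs, skipping removed or disallowed instances."""
    removed = removed_ids or set()
    for row in rows:
        # Drop instances the contributor asked to remove.
        if row["id"] in removed:
            continue
        # Keep only instances whose license is allowed, if a filter is given.
        if allowed_licenses is not None and row["license"] not in allowed_licenses:
            continue
        yield row["id"], {"text": row["text"], "license": row["license"]}


# Example: a "commercial use" view that also honours one removal request.
commercial_ok = list(
    generate_examples(ROWS, removed_ids={"a3"}, allowed_licenses={"public-domain"})
)
print(commercial_ok)
```

A license filter like this could be exposed as one of the dataset's configurations, so that users choose between the full data and a commercially usable subset.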