Add more local datasets

Open jakevdp opened this issue 7 years ago • 2 comments

We can add local datasets if:

  • the dataset license is compatible with the package's MIT license (this is often tough to figure out, because the provenance of many available datasets is unclear)
  • the dataset is small enough that it won't significantly affect the package size

Adding a dataset to the package is easy:

  • add the name to the list at https://github.com/jakevdp/vega_datasets/blob/master/tools/download_datasets.py#L17
  • add dataset description & references (including license if available) to https://github.com/jakevdp/vega_datasets/blob/master/vega_datasets/dataset_info.json
  • run python tools/download_datasets.py
  • commit the downloaded datasets & modified descriptions & open a Pull Request
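Before opening the Pull Request, a small local check along these lines might help catch an oversized file or a missing description. This is only a sketch under assumptions: the 1 MB budget (`MAX_BYTES`) is a made-up number, since the issue only says "small enough"; `check_dataset` is a hypothetical helper, not part of the repository; and the description file is assumed to be a JSON object keyed by dataset name with a `description` field.

```python
import json
import os

# Hypothetical size budget; the issue does not define a concrete limit.
MAX_BYTES = 1_000_000


def check_dataset(name, data_path, info_path, max_bytes=MAX_BYTES):
    """Return a list of human-readable problems (empty if none were found)."""
    problems = []

    # Criterion: the file should not noticeably grow the package.
    size = os.path.getsize(data_path)
    if size > max_bytes:
        problems.append(f"{name}: {size} bytes exceeds budget of {max_bytes}")

    # Criterion: the dataset has a description entry.
    # Assumed layout: {"<dataset name>": {"description": "...", ...}, ...}
    with open(info_path, encoding="utf-8") as f:
        info = json.load(f)

    entry = info.get(name)
    if entry is None:
        problems.append(f"{name}: no entry in description file")
    elif not isinstance(entry, dict) or not entry.get("description"):
        problems.append(f"{name}: entry has no description")

    return problems
```

Calling `check_dataset("my-data", "path/to/my-data.csv", "vega_datasets/dataset_info.json")` with your own name and paths returns an empty list when both checks pass. License compatibility still has to be reviewed by hand.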

jakevdp, Jan 13 '18 04:01

Is CC BY 4.0 compatible with MIT? I would like to add some examples based on Gapminder or World Bank data; both sources are under CC BY 4.0. Is that possible?

iliatimofeev, Mar 24 '19 16:03

Hmm... I think so? In any case the datasets need to be part of http://github.com/vega/vega-datasets/ before they can be included here.

jakevdp, Mar 24 '19 23:03