Licence of each dataset; licence of the list
Thank you for this great catalog! It would help a lot to have a field for the licence of each dataset, using a standardised licence code. This would complement these very helpful fields:
- availability = Availability of dataset
- registration = Requirements for data access
- free = Free access to data (1 = Yes, 0 = No)
In addition, you may want to specify an open data licence for the list itself (not the R package). (If one is already specified, I failed to find it in the README.)
Thank you very much for the feedback! I have added a license to the repository and I will add a field to the dataset with the license of each dataset in the future.
I would suggest using the identifiers from the SPDX License List: https://spdx.org/licenses/. The most convenient source is the set of machine-readable data files published for the SPDX License List. (I am preparing a pull request with R code for this.)
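As a rough illustration of how the machine-readable list could be used to check licence codes, here is a minimal Python sketch (not the R code from the planned pull request). It assumes the JSON file is served at `https://spdx.org/licenses/licenses.json` and has a top-level `licenses` array whose entries carry `licenseId` and `isDeprecatedLicenseId`; the function names are made up for this example.

```python
import json
from urllib.request import urlopen

# Assumed location of the machine-readable SPDX list; verify before relying on it.
SPDX_JSON_URL = "https://spdx.org/licenses/licenses.json"


def fetch_spdx_list(url=SPDX_JSON_URL):
    """Download and parse the SPDX licence list (needs network access)."""
    with urlopen(url) as resp:
        return json.load(resp)


def valid_spdx_ids(spdx_list, include_deprecated=False):
    """Return the set of identifiers found in a parsed SPDX list."""
    return {
        entry["licenseId"]
        for entry in spdx_list["licenses"]
        if include_deprecated or not entry.get("isDeprecatedLicenseId", False)
    }


def check_ids(candidates, spdx_list):
    """Map each candidate code to True if it is a current SPDX identifier."""
    known = valid_spdx_ids(spdx_list)
    return {code: code in known for code in candidates}
```

Calling `check_ids(["CC-BY-4.0", "my-own-licence"], fetch_spdx_list())` would flag codes that need manual review. Passing the parsed list in as an argument keeps the check usable offline with a saved copy of the file.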
Also, in many cases, I expect the data to come with written licence terms that do not match any standard licence (no licence name...). For these at least, having a few extra fields may help, possibly:
- a short description of the licence
- a string for the type of licence (e.g. "attribution")
- a URL to the licence page/the page where hints on the applicable license were found
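To make the proposed fields concrete, here is a small Python sketch of what one record could look like. The field names (`spdx_id`, `licence_description`, `licence_type`, `licence_url`) and the dataset names are hypothetical and only mirror the list above; they are not existing columns of the catalog.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class DatasetLicence:
    """Licence information for one dataset (all field names are illustrative)."""
    dataset: str
    spdx_id: Optional[str] = None              # SPDX identifier, if a standard licence applies
    licence_description: Optional[str] = None  # short free-text description
    licence_type: Optional[str] = None         # e.g. "attribution"
    licence_url: Optional[str] = None          # page where the licence or hints were found

    def is_standard(self) -> bool:
        """True when the dataset is covered by a named SPDX licence."""
        return self.spdx_id is not None


# Two made-up records: one with a standard licence, one with custom terms.
records = [
    DatasetLicence("example-dataset-a", spdx_id="CC-BY-4.0",
                   licence_type="attribution",
                   licence_url="https://example.org/a/terms"),
    DatasetLicence("example-dataset-b",
                   licence_description="Free for academic use, citation required",
                   licence_type="attribution",
                   licence_url="https://example.org/b/terms"),
]

# Plain dictionaries are easy to turn into rows of a table.
rows = [asdict(r) for r in records]
```

Keeping `licence_url` filled in for both kinds of record means every entry can be traced back to the page it was taken from.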
Reviewing all the datasets is certainly a huge amount of work! I think keeping the URL of the page where hints on the applicable licence were found is very helpful (including for datasets licensed under an SPDX licence): it will help when reviewing or updating the PolData dataset at a later stage. Also, such a list of URLs (in a data.frame) would make it possible to automatically archive (on archive.org) the pages that mention the licence; this could be of legal use (note that the SPDX list refers to archive.org in 18 cases).
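A possible way to automate the archiving step, sketched in Python: collect the distinct licence page URLs and build a Wayback Machine "Save Page Now" request for each by prefixing it with `https://web.archive.org/save/`. The `licence_url` key and the example rows are assumptions for illustration, and the archive may rate-limit or refuse automated requests, so treat this as a starting point only.

```python
from urllib.request import Request, urlopen

# Wayback Machine "Save Page Now" prefix.
SAVE_PREFIX = "https://web.archive.org/save/"


def save_url(page_url):
    """Build the address that asks the Wayback Machine to capture page_url."""
    return SAVE_PREFIX + page_url


def unique_licence_urls(rows, key="licence_url"):
    """Collect distinct, non-empty licence page URLs, keeping first-seen order."""
    seen = []
    for row in rows:
        url = row.get(key)
        if url and url not in seen:
            seen.append(url)
    return seen


def request_archive(page_url, timeout=30):
    """Ask the archive to capture one page; returns the HTTP status code.

    Needs network access. Add delays between calls when looping over many URLs.
    """
    req = Request(save_url(page_url),
                  headers={"User-Agent": "licence-archiver-sketch"})
    with urlopen(req, timeout=timeout) as resp:
        return resp.status


# Made-up rows standing in for the catalog table.
rows = [
    {"dataset": "a", "licence_url": "https://example.org/a/terms"},
    {"dataset": "b", "licence_url": "https://example.org/a/terms"},
    {"dataset": "c", "licence_url": None},
]
to_archive = [save_url(u) for u in unique_licence_urls(rows)]
```

De-duplicating first matters because several datasets from the same provider often point to the same terms page.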