covid-19-data

Shift to Wikipedia-compatible license

Open metasj opened this issue 4 years ago • 6 comments

This is important, thank you for posting it. (Do you have related line list data as well?)

Echoing #10 but more specifically: please change the license to a CC license that would let this data be integrated more easily into Wikipedia. WP already has crowdsourced versions of most of this data, with sources, but both projects would benefit from a more thorough synchronization.

The ideal license for data is CC-0, which matches the license of government data (including that in your sources). That would let this dataset be part of the public knowledge graph (Wikidata), where every revision of each data point can have one or more sources, which matters for tracking some of the important nuances here. Please make this happen!

metasj avatar Mar 27 '20 16:03 metasj

I don’t think we need to worry too much about the license: basic facts cannot be copyrighted, and there is case law saying no one can copyright this data.

samboy avatar Mar 27 '20 18:03 samboy

Yes, and anyone can use individual data points without worrying. There is still uncertainty around the implications for people trying to synchronize larger datasets with one another.

Using a CC-0 license, rather than explicitly using an incompatible license, encourages rather than discourages reuse.

metasj avatar Mar 27 '20 18:03 metasj

I second @metasj regarding those who try to build upon these values. For example, I started re-processing and augmenting the JHU dataset a week ago, and I have now integrated the NY Times one as well.

Unfortunately, I'm not sure how I should license my derived files, because neither JHU nor the NY Times specifies an actual proper license...

cipriancraciun avatar Mar 28 '20 12:03 cipriancraciun

@metasj is right. While data itself can't be copyrighted, datasets can be (as long as there is human involvement in the selection and presentation of the data). It would be really useful if this data were released under a Creative Commons license so that the legal terms were clearer (and perhaps less restrictive than the current terms).

kaldari avatar Mar 30 '20 20:03 kaldari

@metasj today we've updated our README.md to clarify that we consider the LICENSE we posted this under to be co-extensive with and equivalent to the CC BY-NC license. Hope that helps.

As for line list data, we may consider making available some version of what we have (which does not cover demographic data for all cases, especially now with so many more) to qualified scientists who ask for it for a specific research project, but we will not be releasing that publicly.

albertsun avatar Apr 02 '20 15:04 albertsun

Thanks for the update, which I didn't see at the time; that is helpful.

I hope you can set up some embargo process where all non-PII COVID data is released under CC-0 after some period, to match the license of the government data and to support future epidemiology.

metasj avatar Oct 16 '20 21:10 metasj