Open-Data-Lab
Identify datasets for potential inclusion in the ODL
One way to start looking into this would be to check open resources like
- https://github.com/awesomedata/awesome-public-datasets and see how sustainable/usable the data are there.
On that basis, we could then decide (see also the ODL inclusion criteria, as per #18) whether we'd like to go for datasets scoring high and/or low/average on those scales.
Another potential candidate: http://retractiondatabase.org/ — described by some as "antediluvian".
Another one: https://orcid.org/blog/2018/10/24/2018-public-data-file .
Datasets and code involved in projects for which there is a bug bounty, e.g. https://rubenarslan.github.io/posts/2018-10-26-on-making-mistakes-and-my-bug-bounty-program/ .
allofplos, as per https://github.com/PLOS/allofplos
https://doi.org/10.5061%2Fdryad.n5g39d7 (probably the most comprehensive public dataset about Hemimastigophora to date)
"Teaching data science with real world datasets" https://twitter.com/emcandre/status/1068139908836012032
Gaia star catalog data, as per http://sci.esa.int/gaia/60192-gaia-creates-richest-star-map-of-our-galaxy-and-beyond/
Here is some inspiration from the kinds of data and related services hosted at IDigInfo's data portal:
- https://idiginfo.org/?q=projects