
Peer Review of Open Research Data

Open InquisitiveVi opened this issue 7 years ago • 5 comments

Confused? New to GitHub? Visit the GitHub help page on our site for more information!

At a glance

  • Submission Name: Peer Review of Open Research Data

  • Contact Lead: @InquisitiveVi (Twitter)

  • Region: #Global

  • Issue Area: #OpenData

  • Issue Type: #Challenge

Description

While the FAIR principles guide the sharing of research data, are there generic attributes that assist decisions on the quality of shared research data? Should research data be peer reviewed after sharing or before collection? These were the questions we discussed during our unconference session. We also wanted to explore existing best practices and novel solutions for the peer review of data. This challenge is to document further ideas from OpenCon alumni and community members.

What are we working on during the do-a-thon? What kinds of support do we need?

How can others contribute?

This post is part of the OpenCon 2017 Do-A-Thon. Not sure what's going on? Head here.

InquisitiveVi avatar Nov 13 '17 14:11 InquisitiveVi

For cases where the data is to be reviewed in the context of a manuscript, there are some guidelines here.

Daniel-Mietchen avatar Nov 14 '17 02:11 Daniel-Mietchen

For cases where the dataset is dynamically changing, extra care is needed. A good example of what can be done to facilitate such reviews is here. This basically takes a SPARQL query

# Select distinct authors (?q) of scholarly articles who are female:
#   P50 = author, P31 = instance of, Q13442814 = scholarly article,
#   P21 = sex or gender, Q6581072 = female
SELECT DISTINCT ?q WHERE {
  ?p wdt:P50 ?q;
     wdt:P31 wd:Q13442814 .
  ?q wdt:P21 wd:Q6581072 .
}

and some timestamps and provides a list of changes that have been made to Wikidata items about female authors of scientific articles.
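The "query plus timestamps" idea can be sketched in a few lines of Python. This is a minimal illustration, not the actual tool: the revision log and item set below are hand-made placeholders standing in for a Wikidata edit history and the result set of the SPARQL query above.

```python
from datetime import datetime

# Hypothetical revision log: (item_id, timestamp, summary) tuples, as might
# be pulled from a Wikidata edit history. All entries are illustrative.
revisions = [
    ("Q42",  datetime(2017, 11, 10), "added P50 claim"),
    ("Q101", datetime(2017, 11, 13), "changed label"),
    ("Q42",  datetime(2017, 11, 15), "removed reference"),
]

# Placeholder for the items returned by the SPARQL query
# (female authors of scientific articles).
matching_items = {"Q42", "Q101"}

def changes_between(revisions, items, start, end):
    """Return revisions to the given items that fall inside [start, end)."""
    return [r for r in revisions
            if r[0] in items and start <= r[1] < end]

# Only the 2017-11-13 edit to Q101 falls inside this window.
window = changes_between(revisions, matching_items,
                         datetime(2017, 11, 12), datetime(2017, 11, 14))
```

A reviewer of a dynamic dataset could then focus on exactly the changes in the review window instead of re-reading the whole dataset.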

Daniel-Mietchen avatar Nov 14 '17 03:11 Daniel-Mietchen

Thank you @Daniel-Mietchen ! I am linking the notes from our unconference session and tagging @chartgerink for feedback. https://docs.google.com/document/d/1DlTOMafXdt2Hgu5A2PiIOdGX4QJ7bRNvyzRZMQSDYVE/edit#heading=h.k44ivrk1hjtt

InquisitiveVi avatar Nov 15 '17 14:11 InquisitiveVi

Thanks @InquisitiveVi for the tag!

My main thing here is at what stage?

  1. Data Management Plans (DMPs) are sort of data reviews pertaining to structure, handling, and storage.
  2. Another form of review could be whether the resulting data are what was said would be collected (verification of the structure).
  3. Another form could be whether the results are reproducible from the provided data.
  4. Another would be whether the data presented in a manuscript are internally consistent (e.g., with statistical checking tools such as statcheck).
  5. What would also be possible is open data review, to indicate FAIR compliance (as indicated in the document), especially whether people can reuse the data without making leaps. I've tried to reuse several open data sets that lacked documentation, even though the authors had received an "Open Data Badge" (it was presumably clear enough to the authors).

Sorry if that's incoherent; I'm just dumping some initial thoughts. I think data review is worthwhile, just as code review is valuable. There are many stages at which it could occur, though, and where would it be actionable at this moment? I think the focus would be on 5 right now, but in an ideal setting it would be all of these plus more 🔥
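To make the internal-consistency idea concrete: statcheck itself is an R package that recomputes p-values from reported test statistics, but the simplest version of the same check needs no distribution tables at all. A toy sketch, assuming an independent-samples t-test where the degrees of freedom must equal n1 + n2 - 2:

```python
def t_test_df_consistent(n1, n2, reported_df):
    """Check one internal-consistency rule for an independent-samples
    t-test: the reported degrees of freedom should be n1 + n2 - 2."""
    return reported_df == n1 + n2 - 2

# A reported "t(58) = 2.1" from two groups of 30 is internally consistent:
t_test_df_consistent(30, 30, 58)  # True
# A reported "t(60) = 2.1" from the same groups would flag an error:
t_test_df_consistent(30, 30, 60)  # False
```

A real checker would also recompute the p-value from the test statistic and degrees of freedom, which is what statcheck does; this sketch only shows the shape of such a rule.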

chartgerink avatar Nov 17 '17 18:11 chartgerink

Thank you @chartgerink for your thoughts on this. Verification of the data structure after initial collection would be very useful, but we also need to think about what will encourage reviewers to get involved, and how the data-collecting researcher(s) or their team(s) will be protected against the common fear of being scooped. Reuse as a criterion, backed by clear documentation, could be a great incentive. The Renga platform from the Swiss Data Science Center could be one way to support reuse of data and workflows: https://datascience.ch/renga-platform/

InquisitiveVi avatar Nov 18 '17 13:11 InquisitiveVi