Mine PMC for ethics statements

Open Daniel-Mietchen opened this issue 8 years ago • 17 comments

possible search terms:

  • ethical
  • "institutional review board"
  • "informed consent" etc.

Daniel-Mietchen avatar Oct 02 '17 20:10 Daniel-Mietchen

The main purpose here would be to see

  • what percentage of articles have a dedicated ethics section, and how that changes over time
  • what kind of information is provided in addition to statements of the "... received ethical approval" and "gave informed consent" kinds.
  • to what extent PIDs are being used in there and for what, and how that changes over time.

Daniel-Mietchen avatar Oct 02 '17 20:10 Daniel-Mietchen

A simple query for "approval number" currently yields 11404 hits: https://www.ncbi.nlm.nih.gov/pmc/?term=%22approval+number%22
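The same count can also be fetched programmatically. Below is a minimal sketch, assuming the NCBI E-utilities `esearch` endpoint with `db=pmc` and JSON output. The helper names `build_esearch_url`, `parse_count` and `pmc_hit_count` are illustrative, and a live query will return a different number from the one quoted above.

```python
import json
import urllib.parse
import urllib.request

ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(term: str) -> str:
    """Build an esearch URL that asks PMC for the hit count only."""
    # retmax=0 suppresses the ID list; only the total count is of interest here.
    params = {"db": "pmc", "term": term, "retmode": "json", "retmax": 0}
    return ESEARCH + "?" + urllib.parse.urlencode(params)


def parse_count(payload: str) -> int:
    """Read the total hit count from an esearch JSON response."""
    return int(json.loads(payload)["esearchresult"]["count"])


def pmc_hit_count(term: str) -> int:
    """Query PMC (network access required) and return the number of hits."""
    with urllib.request.urlopen(build_esearch_url(term), timeout=30) as resp:
        return parse_count(resp.read().decode("utf-8"))


# Example call (not executed here because it needs network access):
# pmc_hit_count('"approval number"')
```

Keeping URL construction and response parsing separate from the network call makes it easy to rerun the same query at intervals and log the counts.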

Daniel-Mietchen avatar Oct 13 '17 02:10 Daniel-Mietchen

Just to clarify that conflict of interest statements are within scope here as well.

Daniel-Mietchen avatar Dec 10 '18 15:12 Daniel-Mietchen

I just reran that "approval number" query from Oct 12, 2017, and it now yields 37963 results, i.e. roughly a 3.3-fold increase in about 3.5 years.

In the meantime, I have begun to collaborate with @petermr, and we are trying to use his ContentMine pipeline (which is currently being ported to Python) to extract ethics statements from PMC. On the way, we have built a first — still very rough — dictionary (i.e. a set of words highly indicative of the topic of ethics statements), and we are trying to also get a list of ethics committees mentioned in PMC-indexed papers.
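To illustrate what dictionary-based matching can look like, here is a minimal sketch in plain Python. It is not the ContentMine pipeline. The first three dictionary entries are the search terms proposed at the top of this thread; the remaining entries and the function names are assumptions made for the example.

```python
import re

# Illustrative dictionary: the first three terms come from this thread,
# the last two are assumptions added for the example.
ETHICS_TERMS = [
    "ethical",
    "institutional review board",
    "informed consent",
    "ethics committee",
    "approval number",
]


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def find_ethics_sentences(text, terms=ETHICS_TERMS):
    """Return (sentence, matched_terms) pairs for sentences containing a dictionary term."""
    pattern = re.compile(
        "|".join(r"\b" + re.escape(t) + r"\b" for t in terms),
        re.IGNORECASE,
    )
    hits = []
    for sentence in split_sentences(text):
        matched = sorted({m.group(0).lower() for m in pattern.finditer(sentence)})
        if matched:
            hits.append((sentence, matched))
    return hits
```

Whole-word matching keeps the sketch simple but misses inflected forms such as "ethically"; a real dictionary would need to handle such variants.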

Daniel-Mietchen avatar Apr 28 '21 22:04 Daniel-Mietchen

Meeting on April 29, 2021:

  • We are considering submitting something to the Wikidata Workshop
  • We are also considering submitting a Research Idea to RIO as well as a research paper, perhaps in WikiJournal
  • There is an event being planned for a weekend in May that is about introducing people to Wikidata in a playful manner. Peter will think about aligning it with the Wikimedia Hackathon
  • We also looked a bit into ContentMine dictionaries.

Daniel-Mietchen avatar Apr 29 '21 13:04 Daniel-Mietchen

Some more notes on this by @ShweataNHegde sit at https://github.com/petermr/dictionary/wiki/Ethics-Statement-Project .

Daniel-Mietchen avatar May 10 '21 13:05 Daniel-Mietchen

A search for "approval number" now gives 38437 results, i.e. about 500 more than just two weeks ago.
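The growth figures quoted in this thread can be checked with a few lines of arithmetic. This is a sketch that assumes each query was run on the date given in the text or on the date of the corresponding comment; the function names are illustrative.

```python
from datetime import date

# (query date, hit count) pairs for the "approval number" search,
# taken from the comments in this thread.
COUNTS = [
    (date(2017, 10, 12), 11404),
    (date(2021, 4, 28), 37963),
    (date(2021, 5, 10), 38437),
]


def fold_change(first, last):
    """Ratio of the later count to the earlier one."""
    return last[1] / first[1]


def hits_per_day(first, last):
    """Average number of new hits per day between two observations."""
    return (last[1] - first[1]) / (last[0] - first[0]).days


print(f"fold change 2017 to April 2021: {fold_change(COUNTS[0], COUNTS[1]):.2f}")
print(f"new hits per day, 2017 to April 2021: {hits_per_day(COUNTS[0], COUNTS[1]):.1f}")
print(f"new hits per day, April to May 2021: {hits_per_day(COUNTS[1], COUNTS[2]):.1f}")
```

Because the counts depend on PMC indexing as well as on publication dates, short-interval rates like the last one should be read with caution.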

Daniel-Mietchen avatar May 10 '21 14:05 Daniel-Mietchen

There are ambiguities at multiple levels.

For instance, this article states that

This study was approved by the Johns Hopkins School of Medicine IRB, Approval Number: IRB00151734. 

The problem here is that Johns Hopkins School of Medicine runs multiple IRBs, and there does not seem to be a straightforward mechanism for resolving the approval number to get more metadata about the process.
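Pulling the identifier out of such a statement is the easy part. Here is a minimal sketch, assuming a regular expression fitted to the single example above; real statements vary widely, and the pattern and the function name `extract_approval_ids` are illustrative only.

```python
import re

# Assumed pattern: "approval number" or "approval no." followed by an
# optional ':' or '#', then an alphanumeric token.
APPROVAL_RE = re.compile(
    r"approval\s+(?:number|no\.?)\s*[:#]?\s*([A-Z0-9][A-Z0-9/\-.]*[A-Z0-9])",
    re.IGNORECASE,
)


def extract_approval_ids(text):
    """Return candidate approval identifiers found in the text."""
    ids = []
    for match in APPROVAL_RE.finditer(text):
        token = match.group(1)
        # Require at least one digit so that ordinary words following
        # "approval number" are not mistaken for identifiers.
        if any(ch.isdigit() for ch in token):
            ids.append(token)
    return ids


statement = (
    "This study was approved by the Johns Hopkins School of Medicine IRB, "
    "Approval Number: IRB00151734."
)
print(extract_approval_ids(statement))
```

This only recovers the string; it does nothing to tell which of the institution's IRBs issued the number, which is the ambiguity described above.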

Daniel-Mietchen avatar May 10 '21 14:05 Daniel-Mietchen

I have started to test the phrase extraction tool NLTK-RAKE: https://towardsdatascience.com/extracting-keyphrases-from-text-rake-and-gensim-in-python-eefd0fad582f . As with all language tools, it will take a day or two to see how useful it is.

On Mon, May 10, 2021 at 4:03 PM Daniel Mietchen @.***> wrote:

There is an Office for Human Research Protections (OHRP) Database for Registered IORGs & IRBs, Approved FWAs, and Documents Received in Last 60 Days https://ohrp.cit.nih.gov/search/irbsearch.aspx that has identifiers for IRBs, but these do not resolve either.

-- Peter Murray-Rust Founder ContentMine.org and Reader Emeritus in Molecular Informatics Dept. Of Chemistry, University of Cambridge, CB2 1EW, UK

petermr avatar May 10 '21 16:05 petermr

https://colab.research.google.com/drive/1sFj07mE2XRyeaplvsTs34-VaDHBjnt6U?usp=sharing

Ayush (openVirus volunteers) and I wrote a piece of code that can extract common phrases from a text file with manually scraped Ethics Statements.
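For readers who cannot open the notebook, the general idea of finding common phrases can be sketched with word n-gram counting from the standard library. This is a simplified stand-in, not the Colab code and not RAKE; the function names and any sample statements passed to them are hypothetical.

```python
import re
from collections import Counter


def ngrams(tokens, n):
    """Return all runs of n consecutive tokens, joined by spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def common_phrases(statements, n=2, top=5):
    """Count in how many statements each n-word phrase occurs."""
    counts = Counter()
    for statement in statements:
        tokens = re.findall(r"[a-z]+", statement.lower())
        # Use a set so a phrase is counted once per statement.
        counts.update(set(ngrams(tokens, n)))
    return counts.most_common(top)
```

Counting per statement rather than per occurrence favours phrases that recur across many papers, which is closer to what a dictionary of indicative terms needs.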

ShweataNHegde avatar May 20 '21 13:05 ShweataNHegde

Some updates from this week:

  • @ShweataNHegde has created a more refined ethics dictionary here, as per these notes
  • I have created Wikidata lexemes for most of the entries in her dictionary, as per this overview
  • I also started WikiProject Ethics.

Daniel-Mietchen avatar Jun 03 '21 20:06 Daniel-Mietchen

For more recent updates, see the notes over at Shweata's page.

Daniel-Mietchen avatar Jun 19 '21 23:06 Daniel-Mietchen

Here is a list of ethics-related entities Shweata has mined from articles on stem cells.

Daniel-Mietchen avatar Jun 24 '21 21:06 Daniel-Mietchen

Some more observations by Shweata and Peter sit here.

We now have a dedicated organization, repo and wiki:

  • https://github.com/FAIR-ethics
  • https://github.com/FAIR-ethics/PMC-ethics
  • https://github.com/FAIR-ethics/PMC-ethics/wiki

Daniel-Mietchen avatar Jul 05 '21 00:07 Daniel-Mietchen

The paper "How does nursing research differ internationally? A bibliometric analysis of six countries" has a Table 1 that looks at certain features of previous studies, including

Extracted specific properties (e.g., contains ethics statements)

Daniel-Mietchen avatar Oct 10 '21 04:10 Daniel-Mietchen

The project with Shweata and Peter (and Ayush) has since led to a publication:

Hegde SN, Garg A, Murray-Rust P, Mietchen D (2022) Mining the literature for ethics statements: A step towards standardizing research ethics. Research Ideas and Outcomes 8: e94685. https://doi.org/10.3897/rio.8.e94685 .

It outlines a workflow for mining ethics statements and discusses motivations, applications and complications.

Daniel-Mietchen avatar Feb 02 '23 22:02 Daniel-Mietchen