
Update the way ReCiter handles books

Open paulalbert1 opened this issue 1 year ago • 1 comments

Scope

Approximately 0.1% of PubMed records are for books, although this share has increased in the past year.

Screenshot 2023-10-18 at 10 43 34 AM

Data model

Books have a different data model. Key differences include:

| Description | Book XML Attribute | Journal Article XML Attribute |
|---|---|---|
| Publication Type | <PublicationType>Book [Chapter]</PublicationType> | <PublicationType>Journal Article</PublicationType> |
| Source Title | <BookTitle> | <JournalTitle> |
| Identifier (ISBN/ISSN) | <ISBN> | <ISSN> |
| Publisher | <Publisher> | N/A |
| Publication Place | <PlaceOfPublication> | N/A |
| Authors | <AuthorList><Author>...</Author></AuthorList> | same |
| Editors (for books) | <EditorList><Editor>...</Editor></EditorList> | N/A |
| Pagination | <PageRange> (especially for chapters) | <MedlinePgn> |
| Publication Frequency | N/A | Could be inferred from <JournalIssue><PubFrequency> |
| DOI | <ELocationID EIdType="doi">...</ELocationID> | same |
| Abstract | <AbstractText>...</AbstractText> (sometimes omitted) | same |
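
As a rough illustration of telling the two record types apart before any field mapping happens, the sketch below classifies records by their wrapper element. It assumes that efetch wraps book records in PubmedBookArticle and journal records in PubmedArticle. The sample XML is invented and heavily simplified, and classify_records is a hypothetical helper, not part of ReCiter.

```python
import xml.etree.ElementTree as ET

# Invented, heavily simplified sample; real efetch output has many more fields.
SAMPLE = """<PubmedArticleSet>
  <PubmedArticle>
    <MedlineCitation>
      <PMID>1</PMID>
      <Article>
        <Journal><Title>Example Journal</Title></Journal>
        <ArticleTitle>An example article</ArticleTitle>
      </Article>
    </MedlineCitation>
  </PubmedArticle>
  <PubmedBookArticle>
    <BookDocument>
      <PMID>2</PMID>
      <Book><BookTitle>Example Book</BookTitle></Book>
      <ArticleTitle>An example chapter</ArticleTitle>
    </BookDocument>
  </PubmedBookArticle>
</PubmedArticleSet>"""

# Assumed mapping from wrapper element name to record kind.
KIND_BY_TAG = {"PubmedArticle": "article", "PubmedBookArticle": "book"}


def classify_records(xml_text):
    """Return a list of (pmid, kind) tuples, one per top-level record."""
    root = ET.fromstring(xml_text)
    results = []
    for record in root:
        kind = KIND_BY_TAG.get(record.tag, "other")
        pmid = record.findtext(".//PMID")  # first PMID found under the record
        results.append((pmid, kind))
    return results


if __name__ == "__main__":
    for pmid, kind in classify_records(SAMPLE):
        print(pmid, kind)
```

Branching on the wrapper element, if that assumption holds, would let a parser pick the right field mapping up front instead of applying the journal mapping to a book record.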

Effect

The inconsistent data model causes errors downstream. For example, for personIdentifier = tme2002 and PMID = 34818336 (see also API), the wrong authors are listed. What is probably occurring is that the author list is shifting by one.

Screenshot 2023-10-18 at 10 37 35 AM Screenshot 2023-10-18 at 10 34 20 AM

Another example: mtoth and 21204454: https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=pubmed&id=21204454&retmode=xml

Options

  1. Update ReCiter PubMed Retrieval Tool not to return books. We could do this like so: cole c[au] NOT (booksdocs[Filter])

  2. Update data model across the projects to handle books:

     • ReCiter PubMed Retrieval Tool
     • ReCiter
     • ReCiterDB
     • ReCiter Publication Manager

  3. Exclude books from ReCiter Feature Generator and Article Retrieval output.

     • Approach 1: Exclude cases where PublicationType = Book [Chapter], or
     • Approach 2: Require JournalTitle attribute
     • Include a flag in application.properties to exclude books
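
Options 1 and 3 can be sketched roughly as follows. This is a minimal illustration, not ReCiter code: Record, exclude_books_from_query, is_book, and filter_records are hypothetical names, the exclude_books argument stands in for whatever flag would be added to application.properties, and reading "Book [Chapter]" as the two values "Book" and "Book Chapter" is an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional

BOOK_FILTER = "NOT (booksdocs[Filter])"
# Assumption: "Book [Chapter]" in the issue means either of these two values.
BOOK_PUBLICATION_TYPES = {"book", "book chapter"}


def exclude_books_from_query(query: str) -> str:
    """Option 1: append the books filter to a PubMed search term."""
    if "booksdocs[filter]" in query.lower():
        return query  # filter already present, leave the query alone
    return f"({query}) {BOOK_FILTER}"


@dataclass
class Record:
    """Hypothetical, pared-down stand-in for a retrieved PubMed record."""
    pmid: str
    publication_types: List[str]
    journal_title: Optional[str] = None


def is_book(record: Record, require_journal_title: bool = False) -> bool:
    if require_journal_title:
        # Approach 2: anything without a journal title is treated as a book.
        return not record.journal_title
    # Approach 1: match on publication type.
    return any(pt.lower() in BOOK_PUBLICATION_TYPES
               for pt in record.publication_types)


def filter_records(records: List[Record],
                   exclude_books: bool = True,
                   require_journal_title: bool = False) -> List[Record]:
    """Option 3: drop book records when the (hypothetical) flag is on."""
    if not exclude_books:
        return list(records)
    return [r for r in records if not is_book(r, require_journal_title)]


if __name__ == "__main__":
    print(exclude_books_from_query("cole c[au]"))
    sample = [
        Record("1", ["Journal Article"], "Example Journal"),
        Record("2", ["Book Chapter"]),
    ]
    print([r.pmid for r in filter_records(sample)])
```

Wrapping the original term in parentheses keeps the NOT clause from binding to only the last operand of a longer query. Matching publication types exactly, rather than by prefix, avoids catching unrelated types such as "Book Review".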

paulalbert1 avatar Oct 18 '23 15:10 paulalbert1

I'm not sure this is still an issue.

Screenshot 2023-10-22 at 12 00 35 PM

Screenshot 2023-10-22 at 12 01 07 PM

paulalbert1 avatar Oct 22 '23 16:10 paulalbert1