What does spread DOIs do?

Open hansioan opened this issue 4 years ago • 3 comments

@bgruening @OlegZharkov

Excuse my ignorance, but I get emails from the github-actions bot about spread DOIs. I know @OlegZharkov worked on this at the Freiburg hackathon, and I see it looks at bioconda YAML files, Debian files and also bio.tools JSON files. What does it do with the bio.tools JSON files? Does it add DOIs where they are missing, or something else?

Thanks

hansioan avatar Apr 08 '20 20:04 hansioan

I am also asking because I randomly looked at: https://github.com/bio-tools/content/pull/208

at the file: data/acde/acde.json and I saw that the change in that file was that, in addition to the existing publication object: { "doi": "10.1007/978-0-387-49317-6_9", "type": "Other" }

it also added another publication object: { "doi": "10.1007/978-0-387-49317-6_9", "type": "Primary" }

with the same DOI as the existing one, except that the type is set to Primary.
This is actually wrong: it adds the same publication a second time as type Primary, which in this case is not correct. That publication has type "Other" and not Primary for a reason: it's not the primary publication.
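
To illustrate the kind of check that would avoid this duplicate, here is a minimal sketch. It is not the actual spread DOIs code; the function name `add_primary_doi` is hypothetical, and it assumes publications are plain dicts with `doi` and `type` keys, as in the acde.json snippet above.

```python
def add_primary_doi(publications, doi):
    """Add `doi` as a Primary publication unless it is already listed under any type.

    Hypothetical helper: `publications` is a list of dicts shaped like the
    bio.tools publication objects quoted in this thread.
    """
    # Collect every DOI already present, regardless of publication type.
    known = {p["doi"].lower() for p in publications if p.get("doi")}
    if doi.lower() in known:
        # Already present: leave the curated entry (and its type) alone.
        return publications
    return publications + [{"doi": doi, "type": "Primary"}]


acde = [{"doi": "10.1007/978-0-387-49317-6_9", "type": "Other"}]
result = add_primary_doi(acde, "10.1007/978-0-387-49317-6_9")
```

With this check, the acde.json case would be left unchanged, while a file with no matching DOI would still get the new publication.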

I also see that in the case of the file data/affy/affy.json it adds a new publication which is actually nice.

hansioan avatar Apr 08 '20 20:04 hansioan

@hansioan yeah, that's probably a bug, because the DOI collector considers only DOIs of type Primary. Should I make it consider every type (in addition to Primary), or just check whether the DOI already exists under another type?

OlegZharkov avatar Apr 09 '20 14:04 OlegZharkov

@OlegZharkov

I am not sure exactly what the code does, I haven't looked at it much to be honest, but I assume it tries to populate DOIs across all the bio.tools (you call it json), bioconda and debian files: if a DOI is present in the bioconda file and not in the bio.tools file, then it adds that DOI to the bio.tools file as well.

If what I assume is correct, then I would put it like this, and this is just for bio.tools, I can't speak for the others. In bio.tools you should not set type Primary on a publication whose type is missing or is some other type unless you actually know it's primary. The only way (in my opinion) to know if it's primary is if the DOI exists in the bioconda and debian files as well. Then you can kinda safely assume it's primary. I would look at the following scenarios:

  • If the bio.tools json doesn't have a publication, or only has publications with other DOIs, but the bioconda and/or debian files have a DOI that bio.tools lacks, then you can take that DOI from the other files and create a publication object with it and type = Primary.
  • If the bio.tools json has a DOI for a publication of a type other than Primary, but that same DOI is also present in bioconda and/or debian, then I would say you could change the type of the publication in bio.tools to Primary.
  • If a DOI in a publication in bio.tools does NOT appear in bioconda or debian, then I think you cannot assume anything and you cannot update that publication type in any way.
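
The three scenarios above can be sketched roughly as follows. This is only an illustration under assumptions, not the real implementation: `reconcile_biotools` is a hypothetical name, `external_dois` stands for whatever DOIs were found in the bioconda and/or debian files, and DOIs are compared case-insensitively.

```python
def reconcile_biotools(publications, external_dois):
    """Return an updated copy of a bio.tools publication list.

    Hypothetical sketch: `external_dois` are the DOIs found in the
    bioconda and/or debian metadata for the same tool.
    """
    external = {d.lower() for d in external_dois}
    seen = set()
    updated = []
    for pub in publications:
        pub = dict(pub)  # copy, so the caller's data is not modified
        doi = (pub.get("doi") or "").lower()
        if doi:
            seen.add(doi)
            if doi in external:
                # Scenario 2: DOI confirmed by another source, promote to Primary.
                pub["type"] = "Primary"
            # Scenario 3: DOI not found elsewhere, leave the type untouched.
        updated.append(pub)
    # Scenario 1: DOIs known only from bioconda/debian are added as Primary.
    for d in external_dois:
        if d.lower() not in seen:
            seen.add(d.lower())
            updated.append({"doi": d, "type": "Primary"})
    return updated
```

For example, with bio.tools publications `10.1/a` and `10.1/b` (both type Other) and external DOIs `10.1/a` and `10.1/c`, this would promote `10.1/a`, leave `10.1/b` alone, and append `10.1/c` as Primary.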

The other way around (this is my opinion and maybe @bgruening doesn't agree):

  • If bio.tools has a DOI for a publication of type Primary and the bioconda and/or debian files do not have that DOI, then I would say it's safe to add that DOI to bioconda and/or debian.

  • This last one is hard as it would require looking things up online, but bio.tools also supports publication IDs other than DOI, such as PubMed id or PubMed Central (PMC) id. If there is some sort of service that provides the DOI given a PubMed id or PMC id then you can infer the DOI and apply the same logic above based on the type of the publication.
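
A rough sketch of this reverse direction, again under assumptions: `primary_dois_to_export` and the `resolve_id` callback are hypothetical names, and the `pmid` / `pmcid` keys are assumed to be how the other IDs appear in the publication objects. The lookup service itself is left as a pluggable function; NCBI's PMC ID Converter may be one candidate, but this sketch does not call it.

```python
def primary_dois_to_export(publications, external_dois, resolve_id=None):
    """List DOIs of Primary bio.tools publications missing from bioconda/debian.

    Hypothetical sketch. `resolve_id(kind, value)` is an assumed callback that
    returns a DOI (or None) for a "pmid" or "pmcid" value.
    """
    external = {d.lower() for d in external_dois}
    missing = []
    for pub in publications:
        if pub.get("type") != "Primary":
            continue  # only Primary publications are propagated
        doi = pub.get("doi")
        if not doi and resolve_id is not None:
            # No DOI recorded: try to infer one from the other identifiers.
            for kind in ("pmid", "pmcid"):
                if pub.get(kind):
                    doi = resolve_id(kind, pub[kind])
                    if doi:
                        break
        if doi and doi.lower() not in external and doi not in missing:
            missing.append(doi)
    return missing
```

Keeping the lookup behind a callback means the matching logic can be tested with a plain dictionary before any online service is wired in.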

hansioan avatar Apr 09 '20 14:04 hansioan