Papers in CORE are missing

Open fxcoudert opened this issue 6 years ago • 9 comments

For example, this paper https://dissem.in/p/53653732/defects-and-disorder-in-metal-organic-frameworks is listed as having no OA version available, but it is present in CORE: https://core.ac.uk/display/35279775?recSetID=

CORE is hugely important because the UK has been leading on deposit mandates (from both UK funders and academic institutions), and CORE is the solution chosen by many universities (Cambridge among others) to integrate their Open Access publications.

fxcoudert avatar Apr 30 '19 16:04 fxcoudert

@fxcoudert err... that's news to me. I believe Cambridge actually direct their authors to use their institutional repository (branded 'Apollo') for OA compliance https://www.repository.cam.ac.uk/

e.g. https://www.openaccess.cam.ac.uk/funder-open-access-policies/policy-compliance-tree deposits to Apollo

I believe CORE is merely a secondary aggregator of content from institutional repositories worldwide, not a primary place of deposition (but perhaps I am mistaken?).

With the specific paper, it's clear it is originally deposited at the Cambridge IR (Apollo) "Downloaded from https://www.repository.cam.ac.uk/handle/1810/253143"

rossmounce avatar Apr 30 '19 21:04 rossmounce

...but this issue is a good one because it affects one of my papers! https://dissem.in/p/93734182/ex-situ-conservation-of-plant-diversity-in-the-worlds-botanic-gardens

I know the full text of this is available from the Cambridge IR because I made sure it was there! https://www.repository.cam.ac.uk/handle/1810/270235

So the bug is real, but it is possibly a problem with the Cambridge IR, not CORE(?)

rossmounce avatar Apr 30 '19 21:04 rossmounce

Here's another paper of mine, deposited in the Cambridge IR, that seemingly isn't picked up by Dissem.in:

Mounce, R. C. P., Sansom, R., & Wills, M. A. (2016). Sampling diverse characters improves phylogenies: Craniodental and postcranial characters of vertebrates often imply different trees. Evolution, 70 (3), 666-686. https://doi.org/10.1111/evo.12884

Dissemin page: https://dissem.in/p/41558077/craniodental-and-postcranial-characters-of-vertebrates-often-imply-different-trees-why-characters-should-be-sampled-holistically

fulltext AAM at the Cambridge repo: https://www.repository.cam.ac.uk/handle/1810/254018

fulltext AAM also at the University of Bath repo: https://researchportal.bath.ac.uk/en/publications/sampling-diverse-characters-improves-phylogenies-craniodental-and

and fulltext at CORE too: https://core.ac.uk/download/pdf/35280560.pdf

rossmounce avatar May 01 '19 14:05 rossmounce

CORE is currently not harvested, see https://dissem.in/sources. Harvesting both would produce a lot of duplicates, but if the data diverge enough, harvesting CORE is probably an option.

But both Bath and Cambridge are harvested by BASE:

https://www.base-search.net/Record/9b48a7e3d71994ba16798a7a6b219c1877c143ac3756993f755739c4bff85e0e/

https://www.base-search.net/Record/003aee3b4cb00d6020816837802374454d895d90dded558df39b7fd6feda6c2e/

On https://dissem.in/p/41558077/ the link to the Bath repo gives a 404, which is bad.

I think this is related to #602.

beckstefan avatar May 02 '19 09:05 beckstefan

https://dissem.in/p/53653732/ does not even give a link to the BASE entry, which exists: https://www.base-search.net/Record/f07553567ed95a5f428d4b4a03997cf84bd1471c2d2d640d4926ef0c79422527/

fxcoudert avatar May 02 '19 11:05 fxcoudert

@beckstefan I understand why https://dissem.in/p/41558077/ gives a 404.

At the time, the Bath repo was at opus.bath.ac.uk. At some point Bath have (I assume?) changed the base URL without putting any redirects in place(!). The corresponding item can now be found at: https://researchportal.bath.ac.uk/en/publications/sampling-diverse-characters-improves-phylogenies-craniodental-and

rossmounce avatar May 02 '19 12:05 rossmounce

@fxcoudert Yes, that shouldn't be the case.

@rossmounce That was my first suspicion too, but that's not the case. The item identifier in Dissemin is causing the issue. It's not the same as in the BASE record I gave and is probably wrong. BASE still uses the old domain.

beckstefan avatar May 02 '19 12:05 beckstefan

https://www.base-search.net/Record/003aee3b4cb00d6020816837802374454d895d90dded558df39b7fd6feda6c2e/ links to http://opus.bath.ac.uk/49927/, which redirects to https://researchportal.bath.ac.uk/en/publications/sampling-diverse-characters-improves-phylogenies-craniodental-and . It may also be that http://opus.bath.ac.uk/48796/ existed at some point and was later deleted as a duplicate, but not before BASE/Dissemin ingested it. BASE no longer has the old record, probably because it has since re-harvested the whole OAI-PMH endpoint from scratch, but Dissemin doesn't have a way to delete previous records for a single endpoint and start from scratch, so the up-to-date record just gets discarded as a duplicate (?) upon reimport of BASE.

nemobis avatar Jul 16 '19 10:07 nemobis

> BASE no longer has the old record, probably because it has since re-harvested the whole OAI-PMH endpoint from scratch, but Dissemin doesn't have a way to delete previous records for a single endpoint and start from scratch, so the up-to-date record just gets discarded as a duplicate (?) upon reimport of BASE.

Something like this may well be the case; thanks for your research.

OAI-PMH requires repositories to communicate when an item is no longer available, so a reimport from scratch is not necessary. But yes, removing items is something that should be on Dissemin's agenda.

beckstefan avatar Jul 16 '19 11:07 beckstefan
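
To make the point about deletions concrete, here is a minimal sketch of how a harvester could honour OAI-PMH deletion markers. In OAI-PMH a deleted item is reported as a record whose `<header>` carries `status="deleted"`; note that support for this is declared per repository (the `deletedRecord` value in the `Identify` response can be `no`, `transient` or `persistent`), so this only helps for endpoints that actually track deletions. Everything below is an assumption for illustration: the sample response, the `oai:example.org:*` identifiers, the function names and the dictionary used as a store are hypothetical and are not Dissemin's or BASE's actual code.

```python
import xml.etree.ElementTree as ET

# Namespace used by OAI-PMH 2.0 responses.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Hypothetical ListRecords response: one deleted record, one live record.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header status="deleted">
        <identifier>oai:example.org:1</identifier>
        <datestamp>2019-05-01</datestamp>
      </header>
    </record>
    <record>
      <header>
        <identifier>oai:example.org:2</identifier>
        <datestamp>2019-05-01</datestamp>
      </header>
      <metadata/>
    </record>
  </ListRecords>
</OAI-PMH>
"""


def split_records(xml_text):
    """Return (live, deleted) identifier lists from a ListRecords response."""
    root = ET.fromstring(xml_text)
    live, deleted = [], []
    for header in root.iterfind(".//oai:record/oai:header", OAI_NS):
        identifier = header.findtext("oai:identifier", namespaces=OAI_NS)
        # A deleted item is flagged by status="deleted" on its header.
        if header.get("status") == "deleted":
            deleted.append(identifier)
        else:
            live.append(identifier)
    return live, deleted


def apply_harvest(store, xml_text):
    """Update a local store (a plain dict here) from one harvested response."""
    live, deleted = split_records(xml_text)
    for identifier in deleted:
        # Drop stale entries instead of keeping them around forever.
        store.pop(identifier, None)
    for identifier in live:
        store[identifier] = "present"
    return store
```

With something along these lines, an incremental harvest could remove a stale entry as soon as the repository reports it deleted, rather than waiting for a purge and full reimport of the endpoint.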