DataCite Citations
What steps does it take to reproduce the issue?
- When does this issue occur?
When we run the script https://guides.dataverse.org/en/latest/_downloads/7e1b7e580244f61d2ac2de279759d154/counter_weekly.sh
- Which page(s) does it occur on?
None
- What happens?
When we run the script to update citations for datasets, Dataverse uses the API endpoint "updateCitationsForDataset" in src/main/java/edu/harvard/iq/dataverse/api/MakeDataCountApi.java.
This method calls the DataCite API to fetch JSON information about the DOI (https://api.datacite.org/events?doi="DOI"&source=crossref&page[size]=1000).
Once Dataverse has this information, it uses the method "parseCitations" in src/main/java/edu/harvard/iq/dataverse/makedatacount/DatasetExternalCitationsServiceBean.java to extract the citation information.
This method distinguishes between inbound ("cites", "references", "supplements") and outbound ("is-cited-by", "is-referenced-by", "is-supplemented-by") relationships. After that, it converts the received DOI into a Dataverse DOI in both cases with:
String globalId = subjectUri.replace("https://", "").replace("doi.org/", "doi:").toUpperCase().replace("DOI:", "doi:");
In our case, our :Shoulder is "data". The conversion turns it into "DATA", so Dataverse doesn't find the globalId: https://doi.org/10.34810/data146 becomes doi:10.34810/DATA146.
We don't know why you transform the received DOI with toUpperCase(). Is it necessary?
When we remove the last part, ".toUpperCase().replace("DOI:", "doi:");", the script generates citations correctly.
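To make the behaviour described above easier to reproduce, here is a minimal, self-contained sketch. It only copies the string expression quoted from parseCitations. The class name DoiCaseDemo and both method names are hypothetical and do not exist in Dataverse, and the plain equals() check stands in for whatever lookup Dataverse actually performs.

```java
public class DoiCaseDemo {

    // Same expression as the one quoted from parseCitations above.
    static String currentConversion(String subjectUri) {
        return subjectUri.replace("https://", "").replace("doi.org/", "doi:")
                .toUpperCase().replace("DOI:", "doi:");
    }

    // Variant with the trailing ".toUpperCase().replace("DOI:", "doi:")" removed.
    static String withoutUpperCase(String subjectUri) {
        return subjectUri.replace("https://", "").replace("doi.org/", "doi:");
    }

    public static void main(String[] args) {
        String received = "https://doi.org/10.34810/data146";
        // Assumption: the identifier as registered, with the lowercase "data" shoulder.
        String stored = "doi:10.34810/data146";

        String upper = currentConversion(received);
        String plain = withoutUpperCase(received);

        System.out.println(upper + " matches stored: " + upper.equals(stored));
        System.out.println(plain + " matches stored: " + plain.equals(stored));
    }
}
```

With a case-sensitive comparison, only the second variant matches the stored identifier, which is consistent with the 0 citations we see.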
- To whom does it occur (all users, curators, superusers)?
MakeDataCount API
- What did you expect to happen?
Should Dataverse change the received DOI to uppercase?
The guide doesn't say anything about this, and its example command for setting a new :Shoulder contains lowercase letters:
curl -X PUT -d "MyShoulder/" http://localhost:8080/api/admin/settings/:Shoulder
Which version of Dataverse are you using?
We use version 5.11.1, but the latest version also has toUpperCase().
Any related open or closed issues to this bug report?
No
An example:
https://dataverse.csuc.cat/dataset.xhtml?persistentId=doi:10.34810/data146
But it shows 0 citations.
The same example after removing ".toUpperCase().replace("DOI:", "doi:");" shows 2 citations.
Thanks for reporting the issue! DOIs are supposed to be case insensitive according to https://support.datacite.org/docs/datacite-doi-display-guidelines#dois-urls-and-case-sensitivity, so I think the root problem is not that we use toUpperCase here, but that the comparison elsewhere is case sensitive. The code for handling PIDs has changed significantly since 5.11.1 (with multi PID Provider support), but in trying to reproduce this, I see citation counting is broken in the current release as well, even for those with an uppercase shoulder. I'll work to get a fix into the current code. (For 5.11, I'm not sure what can be done without a code fix. Using an uppercase shoulder would impact the display and, assuming your file/S3 store is case sensitive, where the files need to be, so it's not a very useful work-around.)
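
As a rough illustration of what a case-insensitive comparison could look like, here is a small sketch. This is not the actual fix and does not reflect how Dataverse looks up PIDs. The class CaseInsensitiveDoiLookup, its methods, and the in-memory list of stored identifiers are all hypothetical stand-ins.

```java
import java.util.List;
import java.util.Locale;
import java.util.Optional;

public class CaseInsensitiveDoiLookup {

    // Hypothetical helper: fold an identifier to one case before comparing.
    // Locale.ROOT avoids locale-specific case mappings.
    static String canonical(String globalId) {
        return globalId.toLowerCase(Locale.ROOT);
    }

    // Hypothetical lookup over an in-memory list standing in for stored PIDs.
    // Returns the identifier exactly as stored, whatever case the incoming one uses.
    static Optional<String> findStored(List<String> storedIds, String incoming) {
        String key = canonical(incoming);
        return storedIds.stream()
                .filter(id -> canonical(id).equals(key))
                .findFirst();
    }

    public static void main(String[] args) {
        List<String> stored = List.of("doi:10.34810/data146");
        // The upper-cased form produced by the current conversion still resolves.
        System.out.println(findStored(stored, "doi:10.34810/DATA146"));
        // An unrelated DOI does not.
        System.out.println(findStored(stored, "doi:10.34810/data999"));
    }
}
```

Comparing folded copies while returning the stored form leaves the display and the storage paths untouched, which matters if the file/S3 store is case sensitive.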