Support for Crossref Funder Registry ID
In a recent report (https://doi.org/10.29242/report.effectivedatapractices2020), the Association of Research Libraries (ARL) recommends wide adoption of these 5 core PIDs to power findability of research data:
I suggest that Crossref Funder Registry IDs be implemented in the Citation Metadata section Grant Information.
Funding agency IDs can be RORs, too (https://github.com/IQSS/dataverse/issues/6640). Considering Dataverse's existing infrastructure and conventions, maybe a way to let depositors add one of several types of organization IDs would be best:

The "Grant Agency ID Type" dropdown could include the values allowed by the DataCite schema: Crossref Funder ID, GRID, ISNI, ROR, Other.
Ideally, if a depositor selects a "Grant Agency ID Type", she should enter a "Grant Agency ID", and vice versa. But depositors might not always see the relationship between the "Grant Agency ID Type" field and the "Grant Agency ID" fields, and might provide one piece of information and not the other. We see this often with the related publication fields. The tooltips could state that if one of these fields is filled, the other should be filled, too. Even better, Dataverse could add functionality where if one field is filled, one or more fields become required.
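The "if one field is filled, the other becomes required" rule could be sketched roughly as follows. This is only an illustration, not Dataverse code: the function name `validate_grant_agency_id` and the error messages are hypothetical, and the allowed values are just the ones listed above.

```python
# Hypothetical sketch of the paired-field rule; not part of Dataverse.
# Allowed values are the ones listed above for the proposed dropdown.
ALLOWED_ID_TYPES = {"Crossref Funder ID", "GRID", "ISNI", "ROR", "Other"}


def validate_grant_agency_id(id_type, id_value):
    """Return a list of error messages; an empty list means the pair is acceptable."""
    id_type = (id_type or "").strip()
    id_value = (id_value or "").strip()
    errors = []

    # Reject types that are not in the dropdown.
    if id_type and id_type not in ALLOWED_ID_TYPES:
        errors.append(f"Unknown Grant Agency ID Type: {id_type}")

    # Each field becomes required as soon as the other one is filled.
    if id_type and not id_value:
        errors.append("Grant Agency ID is required when Grant Agency ID Type is set.")
    if id_value and not id_type:
        errors.append("Grant Agency ID Type is required when Grant Agency ID is set.")

    return errors
```

Leaving both fields empty stays valid in this sketch, so depositors who don't know the funder's ID aren't blocked.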
Another way to implement this would require much more development work, but I'm sure it would result in many more organization IDs being captured (leading to better discovery and tracking of datasets), and it could be applied to other fields (author, producer, contributor, etc.):
- Add a controlled vocab to the Grant Agency Name field, so when depositors start typing, the field suggests organizations.
- If the depositor selects a suggested name, its ID type and ID are filled in, which can help the depositor be certain that:
  - the suggested organization is the one she means to select (for any cases where two or more organizations share similar names)
  - the Grant Agency IDs are stored and included in exported metadata
- If the depositor selects a suggested organization, Dataverse adds its IDs to metadata exports whose schemas have fields/properties/elements/etc for that info (so each suggested organization's IDs would be stored in Dataverse somewhere and added to the metadata exports)
- If the agency that the depositor wants to enter isn't suggested, the depositor can add her own agency name, ID type and ID
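A minimal sketch of the suggest-and-fill behavior described in the list above, assuming a small in-memory vocabulary. Everything here is hypothetical: the `Agency` class, the `suggest` and `fill_grant_fields` functions, the field keys, and the placeholder organizations and IDs are made up for illustration. A real implementation would more likely query an external registry.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Agency:
    name: str
    id_type: str
    identifier: str


# Placeholder vocabulary; these organizations and IDs are invented for the example.
VOCAB = [
    Agency("Example Science Foundation", "Crossref Funder ID", "https://example.org/funder/1"),
    Agency("Example Science Fund", "ROR", "https://example.org/org/2"),
    Agency("Sample Research Council", "ROR", "https://example.org/org/3"),
]


def suggest(typed, vocab=VOCAB, limit=5):
    """Return agencies whose name contains what the depositor has typed so far."""
    needle = typed.strip().casefold()
    if not needle:
        return []
    return [a for a in vocab if needle in a.name.casefold()][:limit]


def fill_grant_fields(typed, selected=None):
    """Fill name, ID type and ID from a selected suggestion, or keep free text."""
    if selected is not None:
        return {
            "grantAgencyName": selected.name,
            "grantAgencyIdType": selected.id_type,
            "grantAgencyId": selected.identifier,
        }
    # No suggestion chosen: the depositor enters her own ID type and ID.
    return {"grantAgencyName": typed.strip(), "grantAgencyIdType": "", "grantAgencyId": ""}
```

Showing the ID type and ID next to each suggestion is what lets the depositor tell apart organizations with similar names, such as the two "Example Science" entries here.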
Also, it would be awesome, as Martin Fenner wrote in IQSS/dataverse#6640, if Dataverse repositories can export the IDs of the funding agency name metadata it already has. This would require bulk updates to the metadata of existing datasets and, for repositories that register DataCite DOIs, sending that metadata to DataCite. (Among the 133,253 datasets whose metadata I've collected from 49 known Dataverse installations, only 4,259 have something in their Grant Agency field and 1,422 have something in a Contributor Name field where Contributor Type = Funder, so that's 5,681 datasets at most with funding agency name metadata, and there's probably some overlap because of the redundant metadata fields (https://github.com/IQSS/dataverse/issues/4859)).
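The "at most" figure is the sum of the two counts, and the true number is the size of the union of the two groups of datasets. A rough sketch of that arithmetic, where the variable names and the `count_with_funding` helper are only illustrative:

```python
# Counts quoted above.
total_datasets = 133_253
with_grant_agency = 4_259
with_funder_contributor = 1_422

# Upper bound: assumes no dataset appears in both groups.
upper_bound = with_grant_agency + with_funder_contributor

# Share of all collected datasets, as a percentage, under that same assumption.
share = 100 * upper_bound / total_datasets


def count_with_funding(grant_ids, contributor_ids):
    """Exact count given dataset identifiers: datasets in both groups count once."""
    return len(set(grant_ids) | set(contributor_ids))
```

With the actual dataset identifiers, `count_with_funding` would give the exact number, which can only be equal to or lower than `upper_bound`.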
ADA would definitely like to see this. Any thoughts on who should implement it? Sounds like a job for the Harvard team to me, but we're happy to contribute if we can.
Priority review with Stefano:
- Moved from NIH Deliverables Backlog to Ordered Backlog
Top priority for upcoming sprint
Sizing:
- In trying to size this we decided that we need to revisit the intent of the issue and where it fits in with 1.5.1
- Going to leave it in the backlog for today and come back to it.
Just adding this before I forget. I’d strongly suggest working on https://github.com/IQSS/dataverse/issues/4859 before any controlled vocabulary functionality is added to the metadata fields for funding information in the Harvard Dataverse (or any Dataverse repository that’s affected by https://github.com/IQSS/dataverse/issues/4859).
Sizing:
- recategorize this as an in-progress deliverable. (mike)
- So far it's made up of https://github.com/IQSS/dataverse/issues/9150 https://github.com/IQSS/dataverse/issues/4859
sizing with Stefano:
- It will be good that we support all 5 listed in the description.
- We now support 4 out of 5.
- Let's determine if we support the 5th.
- If we do we can announce that we support all 5.
- If we don't, make a plan to get support for the 5th. https://user-images.githubusercontent.com/21955790/94335172-10523d80-ffda-11ea-91bd-8cdedab43036.png
We will leave this in this backlog as a placeholder on this deliverable.
bklog grooming:
- It's unclear to me, with what little I understand of the details, whether we should keep this issue and open a deliverable issue to capture all 5 items, or just modify this issue. In my first look, it seems like Philipp buried the lede in the last sentence of the description: "I suggest that Crossref Funder Registry IDs be implemented in the Citation Metadata section Grant Information." Maybe that's the real title for this issue?
- What came out of our conversations today is that we already support 4 of the 5 items, including Funder Registry IDs, and that what's left is support for Crossref Grant IDs.
- But other meetings have this specific issue already being decomposed as #9150 and #4859.
Next steps:
- I've tentatively opened: https://github.com/IQSS/dataverse-pm/issues/19 to act as a deliverable placeholder for all this work.
- I'm going to get together with Julian, hopefully, who can describe this to me in terms that a 5-year-old can understand.
Looking forward to talking @mreekie. I recommended that @pdurbin join us, and perhaps @philippconzett since he opened these issues.
In the meantime, I looked through the ARL article again and I'm hoping it's helpful to think of those "5 core PIDs for powering Findability" as principles for generally using persistent identifiers when identifying people and things, like organizations, funders and funding awards. On page 11, the article starts to give a little more detail on that list:

IQSS/dataverse-pm#19 could be an epic. We could look at every existing metadata field that ships with Dataverse (contributors, producers, etc.); ask if what we expect people to enter in those fields could be represented by a persistent identifier; implement something that improves the likelihood that those things are represented by persistent identifiers, like the javascript for Dataverse's Funding Information field(s) (#9150); and evaluate how well it's helping people add that metadata (or however else Dataverse can capture it).
bklog grooming:
- It's unclear to me, with what little I understand of the details, whether we should keep this issue and open a deliverable issue to capture all 5 items, or just modify this issue. In my first look, it seems like Philipp buried the lede in the last sentence of the description: "I suggest that Crossref Funder Registry IDs be implemented in the Citation Metadata section Grant Information." Maybe that's the real title for this issue?
- What came out of our conversations today is that we already support 4 of the 5 items, including Funder Registry IDs, and that what's left is support for Crossref Grant IDs.
- But other meetings have this specific issue already being decomposed as #9150 and #4859.
Next steps:
- I've tentatively opened: Complete the implementation of the 5 core PIDs for powering Findability #9300 to act as a deliverable placeholder for all this work.
- I'm going to get together with Julian, hopefully, who can describe this to me in terms that a 5-year-old can understand.
Talking with Julian.
- The statement that we support 4 out of 5 of these items is not an accurate way to describe this topic. My take: the 5 items are a very large area to cover, and the best way to define that we are done is to say we are done for now.
sizing:
- PM added to ordered sizing queue
- This specific issue will apply to the last in the list of 5: CrossRef Grant IDs.
- We can discuss and size this in that context.
Note:
- Separately, IQSS/dataverse-pm#19 is going to act as a collection/deliverable. We will gather our current definition of done for all 5 steps there. No action needed on that as part of sizing.
Sizing:
- Next step. Discuss with Julian, Phil, and Stefano to get on the same page.
monthly updates:
- We held a meeting on the general topic of providing support for the 5 types of PIDs.
The outcome of that meeting:
- This is a standalone issue and will be sized and worked as above.
- There is a new deliverable label, "Deliverable: 5 Core PIDs", and a "bklog: Deliverable" issue created to coordinate it.
This GitHub issue is in the "SPRINT - NEEDS SIZING" column, but I think that the GitHub issue "Create a javascript for the frontend that supports Fundref" (https://github.com/IQSS/dataverse/issues/9150) represents the work to meet the goals expressed in this issue.
So I'm not sure that this issue needs to be given a size. Does it? When the PR for https://github.com/IQSS/dataverse/issues/9150 is merged, shouldn't it also close this issue?
Prepping for prio meeting:
This GitHub issue is in the "SPRINT - NEEDS SIZING" column, but I think that the GitHub issue "Create a javascript for the frontend that supports Fundref" (#9150) represents the work to meet the goals expressed in this issue.
So I'm not sure that this issue needs to be given a size. Does it? When the PR for IQSS/dataverse#9150 is merged, shouldn't it also close this issue?
@jggautier Thank you. @scolapasta @pdurbin - I'm putting a stake in the sand for Julian's assertion. Pull it out if you disagree.
Next Step:
- During prio meeting in a few minutes, check-in with Gustavo and get this off the board.
backlog grooming
- Based on how the discussions ended, @jggautier @scolapasta @pdurbin, I am closing this issue, as the objective of handling "Crossref grant IDs" is already covered by #9150.