HydroShare minted DOIs do not include author information in DOI metadata
Description of the bug
DOIs minted with the prefix 10.4211/hs. (HydroShare-minted DOIs) do not contain author information. This has several unintended consequences (that I can think of offhand):
- Trackers cannot link a person's work stored on HydroShare to that person's collection of work (Google Scholar, etc.) using only a DOI.
- DOIs minted by HydroShare do not contain enough information to properly cite a resource using only the DOI metadata. A user would need to visit HydroShare to obtain more information.
- HydroShare DOIs are not records of who contributed work, but instead they are records of the work itself.
- DOIs minted by HydroShare are dependent on HydroShare. If HydroShare were to go down, there would be no record of who did the work. There would only be a record that CUAHSI was the publisher of that work.
Showing this is the case:
# at the time of opening this, 10.4211 has minted 898 DOIs. Hence, rows = 1000
# select field 'DOI' and 'author' from DOIs with prefix '10.4211'
# filter results to only include DOIs that have an author field
$ curl "https://api.crossref.org/prefixes/10.4211/works?select=DOI,author&rows=1000" | \
jq '.message.items |
map(select(.author != null) | .DOI )'
[
"10.4211/spatialdata-johnstonc01",
"10.4211/techrpts.20110517.tr10",
"10.4211/his-data-shalenetwork",
"10.4211/technical.20171009",
"10.4211/techrpts.200605.bgc",
"10.4211/techrpts.200911.tr8",
"10.4211/his-5644-agci-irondataset",
"10.4211/technical.20161019",
"10.4211/techrpts.200208.tr1",
"10.4211/sciplan.waters.20090515",
"10.4211/datacenterspec.20120611",
"10.4211/his-data-cedarriverforestsnow",
"10.4211/techrpts.200605.geo",
"10.4211/stratplan.201012",
"10.4211/techrpts.200412.tr6",
"10.4211/techrpts.200208.tr3",
"10.4211/techrpts.20100616.tr9",
"10.4211/annual.201111",
"10.4211/52abfd80ae794c0588d133897dcfafa3",
"10.4211/techrpts.200208.tr4",
"10.4211/techrpts.20110317.tr10",
"10.4211/his-5654",
"10.4211/techrpts.200208.tr2",
"10.4211/techrpts.200605.wc",
"10.4211/spatialdata-glhymps",
"10.4211/sciplan.200711"
]
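The same filter can be sketched offline in Python, which makes it easy to check specifically for the `10.4211/hs.` prefix. This is a minimal sketch, not part of HydroShare: `split_by_author` is a hypothetical helper, the sample stands in for the `.message.items` array returned by the query above, and the author name in it is a placeholder rather than real metadata.

```python
import json


def split_by_author(items):
    """Partition Crossref work items into DOIs with and without an author list."""
    with_author, without_author = [], []
    for item in items:
        # A missing or empty "author" value is treated as "no author deposited".
        target = with_author if item.get("author") else without_author
        target.append(item["DOI"])
    return with_author, without_author


# Illustrative stand-in for `.message.items`; the author entry is a placeholder.
sample = json.loads("""
{"message": {"items": [
  {"DOI": "10.4211/his-5654", "author": [{"family": "Placeholder"}]},
  {"DOI": "10.4211/hs.546fa3feeaf242fc8aabf9fe05ab454c"}
]}}
""")

with_author, without_author = split_by_author(sample["message"]["items"])
# DOIs minted by HydroShare that carry author metadata (none in the list above).
hs_with_author = [doi for doi in with_author if doi.startswith("10.4211/hs.")]
print(len(with_author), len(without_author), hs_with_author)
```

Running it against the real API response would only require replacing `sample` with the parsed JSON from the `curl` call.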
Retrieve DOI metadata for a published HydroShare resource:
$ curl -LH "Accept: application/vnd.citationstyles.csl+json" https://doi.org/10.4211/hs.546fa3feeaf242fc8aabf9fe05ab454c | jq
{
"indexed": {
"date-parts": [
[
2021,
12,
14
]
],
"date-time": "2021-12-14T20:11:09Z",
"timestamp": 1639512669827
},
"reference-count": 0,
"publisher": "Consortium of Universities for the Advancement of Hydrologic Science, Inc. (CUAHSI)",
"content-domain": {
"domain": [],
"crossmark-restriction": false
},
"DOI": "10.4211/hs.546fa3feeaf242fc8aabf9fe05ab454c",
"type": "dataset",
"created": {
"date-parts": [
[
2018,
11,
25
]
],
"date-time": "2018-11-25T23:47:46Z",
"timestamp": 1543189666000
},
"source": "Crossref",
"is-referenced-by-count": 0,
"title": "NOAA NHC - Irma Storm Track - Advisories Track",
"prefix": "10.4211",
"member": "2730",
"container-title": "HydroShare Resources",
"original-title": [],
"deposited": {
"date-parts": [
[
2018,
11,
25
]
],
"date-time": "2018-11-25T23:47:46Z",
"timestamp": 1543189666000
},
"score": 1,
"subtitle": [],
"short-title": [],
"issued": {
"date-parts": [
[
null
]
]
},
"references-count": 0,
"URL": "http://dx.doi.org/10.4211/hs.546fa3feeaf242fc8aabf9fe05ab454c",
"relation": {}
}
Try to retrieve citation using only DOI minted by HydroShare:
$ curl -LH "Accept: text/bibliography; style=mla" https://doi.org/10.4211/hs.546fa3feeaf242fc8aabf9fe05ab454c
“NOAA NHC - Irma Storm Track - Advisories Track.” HydroShare Resources. Crossref, https://doi.org/10.4211/hs.546fa3feeaf242fc8aabf9fe05ab454c.
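To make the gap concrete, the check below lists which fields a citation formatter would typically want but cannot find in a CSL-JSON record. It is a rough sketch: `missing_citation_fields` is a hypothetical helper, and the choice of author, title, publisher, and issued date as the "needed" fields is my assumption, not a Crossref or CSL requirement. The record is a subset of the metadata returned above.

```python
def missing_citation_fields(csl):
    """Return the names of citation-relevant fields absent from a CSL-JSON record."""
    missing = []
    # Assumed minimum set of fields for a usable citation.
    for field in ("author", "title", "publisher"):
        if not csl.get(field):
            missing.append(field)
    # An unset issue date appears as {"date-parts": [[null]]} in the record above.
    date_parts = csl.get("issued", {}).get("date-parts", [[None]])
    if not date_parts or not date_parts[0] or date_parts[0][0] is None:
        missing.append("issued")
    return missing


# Subset of the CSL-JSON returned for the HydroShare DOI above.
record = {
    "publisher": "Consortium of Universities for the Advancement of "
                 "Hydrologic Science, Inc. (CUAHSI)",
    "DOI": "10.4211/hs.546fa3feeaf242fc8aabf9fe05ab454c",
    "title": "NOAA NHC - Irma Storm Track - Advisories Track",
    "issued": {"date-parts": [[None]]},
}

print(missing_citation_fields(record))  # ['author', 'issued']
```

This matches the MLA output above, which has neither an author nor a year.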
Yes, at the time when resource publication was implemented, it was decided that only the resource title would be deposited as Crossref metadata. Agreed, adding author information to the Crossref metadata sounds like a good idea.
The metadata schemas of DOI registrars make it possible to deposit quite a lot of metadata. Doing so could be important for people (or applications) who want to get metadata for HydroShare resources using only the DOI. Yes, you can resolve the DOI and then go to HydroShare and harvest metadata, but DOI registrars are now starting to provide metadata that is standardized across registrars, so I think we should consider a more complete mapping of our resource metadata to the metadata schema of our DOI registrar so that it could be deposited upon DOI creation. The downside is that if we deposit metadata that is subject to change for published resources, the deposited copy might have to be updated whenever a published resource is edited.
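One way to handle that downside is to compare the fields that were deposited with the resource's current metadata and redeposit only when they differ. The sketch below is illustrative only: `needs_redeposit`, the `DEPOSITED_FIELDS` tuple, and the dictionary shape are hypothetical and do not reflect HydroShare's actual data model.

```python
# Hypothetical set of fields that are sent to the DOI registrar.
DEPOSITED_FIELDS = ("title", "authors")


def needs_redeposit(deposited, current, fields=DEPOSITED_FIELDS):
    """Return True if any deposited field differs from the current metadata."""
    return any(deposited.get(name) != current.get(name) for name in fields)


deposited = {"title": "NOAA NHC - Irma Storm Track - Advisories Track", "authors": []}
# Placeholder edit: an author was added and the abstract (not deposited) changed.
current = dict(deposited, authors=["Placeholder, A."], abstract="edited")

print(needs_redeposit(deposited, current))    # True: authors changed
print(needs_redeposit(deposited, deposited))  # False: nothing changed
```

Limiting the comparison to deposited fields means edits to metadata that never went to the registrar would not trigger an update.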
> Yes, at the time when resource publication was implemented, it was decided that only the resource title would be deposited as Crossref metadata. Agreed, adding author information to the Crossref metadata sounds like a good idea.
Thanks for providing history and context @hyi!
> so I think we should consider a more complete mapping of our resource metadata to the metadata schema of our DOI registrar so that it could be deposited upon DOI creation.
@horsburgh I completely agree with you. What you are describing sounds much larger in scope than just adding author information to existing HydroShare minted DOIs. We may want to open a related ticket that links or details the canonical DOI schema and discusses HydroShare resource metadata fields that should be included when HydroShare mints a DOI.
> The downside of doing this would be that if we deposit metadata that is subject to change for published resources, that might have to be updated when edits are made to published resources.
To your point, I agree that ideally we would want more extensive metadata in minted DOIs. However, I am personally in favor of a hot-fix that first updates existing HydroShare DOIs with the low-hanging-fruit fields (i.e., author, ORCID, etc.). Your proposal sounds like it would require a bit of development work and planning (which I totally support doing). I think the hot-fix would allow us to learn about the updating process and use that knowledge to push the development of a more holistic and robust DOI minting mechanism.
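As a rough illustration of what the author/ORCID hot-fix might produce, the sketch below builds a `<contributors>` fragment of the kind used in Crossref deposit XML. The element and attribute names (`person_name`, `given_name`, `surname`, `ORCID`, `sequence`, `contributor_role`) follow my understanding of the Crossref deposit schema and should be checked against the schema version in use. `build_contributors` and the author dictionary shape are hypothetical, and the names and ORCID iD are placeholders.

```python
import xml.etree.ElementTree as ET


def build_contributors(authors):
    """Build a <contributors> element from a list of author dictionaries."""
    contributors = ET.Element("contributors")
    for index, author in enumerate(authors):
        person = ET.SubElement(
            contributors,
            "person_name",
            sequence="first" if index == 0 else "additional",
            contributor_role="author",
        )
        if author.get("given"):
            ET.SubElement(person, "given_name").text = author["given"]
        ET.SubElement(person, "surname").text = author["family"]
        if author.get("orcid"):
            # The ORCID iD is written as a full URL.
            ET.SubElement(person, "ORCID").text = "https://orcid.org/" + author["orcid"]
    return contributors


# Placeholder authors, not real resource metadata.
authors = [
    {"given": "Jane", "family": "Doe", "orcid": "0000-0002-1825-0097"},
    {"family": "Roe"},
]
print(ET.tostring(build_contributors(authors), encoding="unicode"))
```

A real deposit would embed this fragment inside the full dataset record and resubmit the complete XML for each existing DOI.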
Looks like we can resubmit and they encourage it. https://www.crossref.org/blog/open-abstracts-where-are-we/
> Publishers who wish to distribute their abstracts openly through Crossref can include them in the normal content registration process. They can send XML to Crossref (using Crossref's metadata deposit schema), either directly via HTTPS POST or via the Crossref admin system. For back-content, a resubmission of the full XML is required. In addition, various tools can be used to deposit abstracts. Open Journal Systems (OJS) has a plugin that supports the depositing of abstracts. Metadata Manager also facilitates this, but only for journal articles. Crossref's web deposit form does not yet support abstracts, but Crossref is working on this.
@aaraney - Yes, I think this can be staged. DOI metadata is meant to be updated, and I think doing it in stages while focusing on the highest value metadata (e.g., author information) seems reasonable.
This issue seems related, but please feel free to move my comment to a new issue if that is a better fit.
How do I include the organization's name in the resource citation? Should this be more explicitly supported in what the user wants to include in the citation? Should the citation also include the resource type, such as a report, dataset, paper etc?
Here is a citation of my resource. I'd love to include the publisher organization and resource type (report): Abdallah, A., T. Willardson, R. James (2022). The 2022 National Water Use Data Workshop Summary Report, HydroShare, https://doi.org/10.4211/hs.61025a06d0644d1abf407dc97394d664
@amabdallah - HydroShare is a repository, so if you are sharing content in HydroShare and you are expecting HydroShare to be the source of that content, then HydroShare is the publishing organization. Currently, HydroShare only has one resource type. We don't include content types in the citations for resources. You've permanently published this resource and have gotten a DOI through HydroShare, so that means that HydroShare is the publisher of the content you have provided.
If you want to edit the citation for a resource, we only allow that for referenced content. So, if the report is available somewhere else on the web and you want to include a reference to it in your HydroShare resource, you can use HydroShare's referenced content functionality to do that. If you add referenced content to your resource, then you will be able to edit the citation of the resource. The tradeoff is that since we can't guarantee what is on the other end of your reference, we don't allow HydroShare users to permanently publish a resource that contains referenced content. HydroShare's policy is to not issue DOIs for content that we can't guarantee will be available.