Citing idutils
I would love to be able to cite idutils in a publication e.g. using a DOI ;)
Is this possible? Does a citable publication about idutils exist? If not, you might consider getting a DOI, e.g. via Zenodo:
https://guides.github.com/activities/citable-code/
I think it's a great idea to facilitate the citation of idutils. What I'm wondering is how you want to do it. Zenodo offers a way to create a new DOI upon each release. My question is how the new release can include a citation file (like CITATION.cff or codemeta.json) that references this DOI, given that Zenodo generates the DOI AFTER the tagged code (which includes these files) is released.
When I have used Zenodo in the past, I could reserve a DOI before publishing the Zenodo record. I would then manually update the CITATION.cff or codemeta.json file with this reserved DOI, and only then tag/release the code. So I'm wondering how the Zenodo/GitHub integration handles this scenario.
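To make the manual step concrete, here is a minimal Python sketch of writing an already reserved DOI into CITATION.cff before tagging. It is only an illustration under stated assumptions: the function name set_cff_doi and the DOI values are made up, it edits the file as plain text rather than parsing YAML, and it does not talk to Zenodo at all.

```python
import re


def set_cff_doi(cff_text: str, doi: str) -> str:
    """Return cff_text with its top-level `doi:` key set to `doi`.

    Hypothetical helper: assumes the DOI was already reserved by hand.
    """
    new_line = f'doi: "{doi}"'
    # Match only an unindented key, so DOIs nested under `references:`
    # or `identifiers:` are left alone.
    pattern = re.compile(r"^doi:.*$", flags=re.MULTILINE)
    if pattern.search(cff_text):
        return pattern.sub(lambda _match: new_line, cff_text, count=1)
    # No existing key: append one, keeping a single trailing newline.
    return cff_text.rstrip("\n") + "\n" + new_line + "\n"


# Placeholder content and placeholder DOIs, for illustration only.
example = 'cff-version: 1.2.0\ntitle: "idutils"\ndoi: "10.5281/zenodo.0000000"\n'
updated = set_cff_doi(example, "10.5281/zenodo.0000001")
print(updated)
```

The updated text would be committed before the tag is applied, so the tagged source already carries the DOI that the published record will have.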
FYI, I am actively working on building tools to help automate citation, so if anyone is interested in helping me think through these problems, I welcome it.
At the Caltech Library we've been working on our own workflow for archiving, using iga https://caltechlibrary.github.io/iga/. It runs as a GitHub Action, so you have more control over the archiving process than with the built-in Zenodo workflow. iga doesn't work with Zenodo yet because they are running their legacy API, but hopefully we'll get that fixed soon.
Our current workflow is to release on GitHub, archive in the repository to get the DOI, add the DOI to codemeta.json in the GitHub Action, and then rebuild the CFF. This does mean that the zipped copy of the release has the incorrect version DOI. But the "chicken and egg" problem with the DOI is real and has no great solution. If the archiving is triggered by a GitHub release, you either have a non-working DOI in the metadata until the release happens or the wrong version DOI in the archived content.
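The "add the DOI to codemeta.json, then rebuild the CFF" step can be sketched roughly as follows. This is not iga's actual code. The function names are hypothetical, storing the DOI as a URL in the "identifier" property is an assumption about how the metadata is laid out, and the output is only a fragment of a CFF file (a valid CITATION.cff also needs fields such as cff-version, message and authors).

```python
import json

DOI_PREFIX = "https://doi.org/"


def add_doi_to_codemeta(codemeta_json: str, doi: str) -> str:
    """Return codemeta.json text with the DOI stored under "identifier"."""
    data = json.loads(codemeta_json)
    # Assumption: the DOI is recorded as a resolvable URL.
    data["identifier"] = DOI_PREFIX + doi
    return json.dumps(data, indent=2) + "\n"


def cff_fragment(codemeta_json: str) -> str:
    """Rebuild a few CFF lines (title, version, doi) from codemeta.json."""
    data = json.loads(codemeta_json)
    doi = data["identifier"]
    if doi.startswith(DOI_PREFIX):
        doi = doi[len(DOI_PREFIX):]
    # json.dumps produces double-quoted strings, which YAML also accepts.
    lines = [
        "title: " + json.dumps(data["name"]),
        "version: " + json.dumps(data["version"]),
        "doi: " + json.dumps(doi),
    ]
    return "\n".join(lines) + "\n"


# Placeholder metadata and placeholder DOI, for illustration only.
codemeta = '{"name": "idutils", "version": "1.0.0"}'
with_doi = add_doi_to_codemeta(codemeta, "10.5281/zenodo.0000000")
print(cff_fragment(with_doi))
```

Because these files are rewritten after the release exists, the archived zip still holds the pre-update metadata, which is the mismatch described above.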
For idutils, the team has to decide whether we're using the Zenodo built-in workflow or iga -> Zenodo. We also have to figure out metadata. But since InvenioRDM powers Zenodo, we really should just get it done.
That's very interesting. I would imagine that for all extrinsic persistent identifiers (i.e., those not based on content, like DOIs) and some intrinsic persistent identifiers (those not based on version metadata content) there is no chicken-or-egg problem in generating the persistent identifier, and that adding it to the source code before tagging and releasing the code is both possible and preferable. In these cases, reserving a persistent identifier before publishing works, for example by reserving a Zenodo identifier in a draft Zenodo record before publishing that record. That seems preferable to having the wrong DOI tagged with the source code. However, some intrinsic persistent identifiers like SWHIDs, which depend on hashing the files that contain the identifier, do seem to have a chicken-and-egg problem.
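A small Python sketch of why a content-based identifier cannot be embedded in the content it identifies. It computes a content-level SWHID (swh:1:cnt:...), which uses the same hash as a Git blob: SHA-1 over a "blob <length>" header, a NUL byte, and the file bytes. The file contents here are made up, and a real release would more likely be identified at the directory or revision level, but the circularity is the same.

```python
import hashlib


def swhid_for_content(data: bytes) -> str:
    """Content SWHID: SHA-1 over a Git-style blob header plus the bytes."""
    header = b"blob %d\0" % len(data)
    return "swh:1:cnt:" + hashlib.sha1(header + data).hexdigest()


# Made-up file content, for illustration only.
source = b'__version__ = "1.0.0"\n'
id_before = swhid_for_content(source)

# Embedding the identifier changes the bytes, so the identifier changes too.
source_with_id = source + b"# " + id_before.encode() + b"\n"
id_after = swhid_for_content(source_with_id)

print(id_before)
print(id_after)
print("identifier still valid after embedding:", id_before == id_after)
```

A reserved DOI has no such dependency on the bytes, which is why it can be written into the files before tagging.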
It depends on how the archiving workflow is triggered. If the workflow applies the tag, then it could make a draft record to get the DOI, add the DOI to the metadata, apply the tag, and then go through the rest of the workflow. However, this requires researchers to use a specific workflow to release their code, which seems like a big ask. Honestly, just getting researchers to do releases is hard enough. I'm very interested in solutions, but I think DOIs are very close to SWHIDs in having the chicken-and-egg problem.