ReScience

Archive github auxiliary data

Open pdebuyl opened this issue 4 years ago • 12 comments

As mentioned in https://github.com/ReScience/submissions/issues/41 (and probably discussed earlier, I can't find links though), at some point ReScience should archive all repositories and the extra data, specifically issues and pull request discussions.

pdebuyl avatar Jun 30 '20 17:06 pdebuyl

We're doing that at JOSS. Can you brief here @arfon?

labarba avatar Jun 30 '20 18:06 labarba

We probably need to switch to the JOSS framework at some point; it will make things easier for everyone (especially me). However, we have some specificities and I don't know whether they are compatible:

  1. We have volume / issue / special issue
  2. We have different types of articles
  3. We (now) use Software Heritage instead of Zenodo

There are probably some other specificities that I don't remember right now, but they're certainly minor. @labarba @arfon do you know what the cost of switching would be in terms of man-months? (And do we need to hire someone to do that?)

cc to @benoit-girard @khinsen @oliviaguest

rougier avatar Jun 30 '20 18:06 rougier

None of the things you mention have any impact on adopting the JOSS framework. JOSS has volume/issue as well. Special issue sounds like a minor tweak (but we're adopting theme tracks in JOSS soon, so could be related). To link a paper to the archive, be it Zenodo or another one, all you need is the global identifier and a command: @whedon add <id> as archive. This embeds the identifier in the references metadata field of the Crossref deposit. (Not sure if whedon checks that it is a DOI, but perhaps a tweak is needed if the URL associated with the identifier is not on the doi.org domain.)
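A minimal sketch of the check raised in the parenthetical above: deciding whether an archive identifier is a DOI or a Software Heritage identifier (SWHID), and building the URL it resolves through. This is not whedon's code; the function name `classify_archive_id` and both regular expressions are assumptions made for illustration, and the SWHID pattern covers only the core `swh:1:<type>:<hash>` form with optional qualifiers.

```python
import re
from typing import Tuple

# Hypothetical patterns, written for illustration only (not taken from whedon).
DOI_RE = re.compile(r"^10\.\d{4,9}/\S+$")
SWHID_RE = re.compile(r"^swh:1:(cnt|dir|rev|rel|snp):[0-9a-f]{40}(;\S+)?$")


def classify_archive_id(identifier: str) -> Tuple[str, str]:
    """Return (kind, resolver_url) for a DOI or a SWHID.

    Raises ValueError for anything that matches neither pattern.
    """
    identifier = identifier.strip()
    # Accept DOIs that were pasted as a doi.org URL or with a "doi:" prefix.
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if identifier.lower().startswith(prefix):
            identifier = identifier[len(prefix):]
            break
    if DOI_RE.match(identifier):
        return "doi", "https://doi.org/" + identifier
    if SWHID_RE.match(identifier):
        return "swhid", "https://archive.softwareheritage.org/" + identifier
    raise ValueError(f"not a recognised DOI or SWHID: {identifier!r}")
```

With a check like this, a bot could accept either kind of identifier and still store a resolvable link, rather than assuming the link lives on the doi.org domain.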

labarba avatar Jun 30 '20 19:06 labarba

OK, great. For Software Heritage, I meant for the source code. For the article we still use Zenodo, but we can switch to Crossref.

rougier avatar Jun 30 '20 20:06 rougier

Gah. Yes, I was talking about the software archive: we use Zenodo, you use Software Heritage... it doesn't matter, because all we need is a global identifier. The bot whedon makes sure that ID goes into the XML file that is deposited with Crossref.

Note that Crossref is only a registry. What is deposited there is the metadata of the paper only.

JOSS makes permanent archival of the paper using Portico.

labarba avatar Jun 30 '20 20:06 labarba

> As mentioned in ReScience/submissions#41 (and probably discussed earlier, I can't find links though), at some point ReScience should archive all repositories and the extra data, specifically issues and pull request discussions.

> We're doing that at JOSS. Can you brief here @arfon?

Not quite. We archive a JSON representation of the JOSS review thread once the paper has been accepted and published. This archive is deposited with Portico, together with the paper PDF and the Crossref metadata. As @labarba mentioned, the software is archived with Zenodo (typically), but that is usually only the software and not the rest of the repository activity (issues, pull requests, etc.).
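A rough sketch of what producing such a JSON representation could look like for a whole repository, using the public GitHub REST API (`/repos/{owner}/{repo}/issues` and `/issues/{number}/comments`). This is an illustration under stated assumptions, not the JOSS implementation: the function names, the output layout, and the injectable `fetch` parameter are all hypothetical, and the default fetcher is unauthenticated, so it is subject to GitHub's low anonymous rate limit.

```python
import json
from typing import Any, Callable, List
from urllib.request import Request, urlopen

API = "https://api.github.com"


def fetch_json(url: str) -> Any:
    """Fetch one page of JSON from the GitHub REST API (unauthenticated)."""
    req = Request(url, headers={"Accept": "application/vnd.github+json"})
    with urlopen(req) as resp:
        return json.load(resp)


def fetch_all_pages(url: str, fetch: Callable[[str], Any]) -> List[Any]:
    """Follow simple page-number pagination until an empty page comes back."""
    items: List[Any] = []
    page = 1
    while True:
        sep = "&" if "?" in url else "?"
        batch = fetch(f"{url}{sep}per_page=100&page={page}")
        if not batch:
            return items
        items.extend(batch)
        page += 1


def archive_threads(owner: str, repo: str,
                    fetch: Callable[[str], Any] = fetch_json) -> str:
    """Return a JSON string with every issue and pull request thread."""
    base = f"{API}/repos/{owner}/{repo}"
    # The issues endpoint also lists pull requests; those entries carry
    # a "pull_request" key, which is recorded below.
    issues = fetch_all_pages(f"{base}/issues?state=all", fetch)
    archive = []
    for issue in issues:
        comments = fetch_all_pages(
            f"{base}/issues/{issue['number']}/comments", fetch)
        archive.append({
            "number": issue["number"],
            "title": issue["title"],
            "is_pull_request": "pull_request" in issue,
            "author": (issue.get("user") or {}).get("login"),
            "created_at": issue["created_at"],
            "body": issue.get("body"),
            "comments": [
                {
                    "author": (c.get("user") or {}).get("login"),
                    "created_at": c["created_at"],
                    "body": c.get("body"),
                }
                for c in comments
            ],
        })
    return json.dumps(archive, indent=2, sort_keys=True)
```

Calling `archive_threads("ReScience", "submissions")` would yield one JSON document that could be deposited next to the article. Note that this captures only the conversation comments; inline pull request review comments come from a separate endpoint and are left out of the sketch.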

> We (now) use Software Heritage instead of Zenodo

I suspect this is fine for Whedon to handle. We'd probably have to make a small tweak to allow Whedon to work with a non-DOI string for the archive but I suspect this wouldn't be too hard.

> There are probably some other specificities that I don't remember right now, but they're certainly minor. @labarba @arfon do you know what the cost of switching would be in terms of man-months? (And do we need to hire someone to do that?)

~20 hours, I suspect, although we should probably have a detailed technical discussion to be sure.

arfon avatar Jul 05 '20 12:07 arfon

I want to add that, as a prospective author (we have a draft in a near-ready state), I would be much happier if ReScience made the hop to Open Journals and adopted that system, in particular the Crossref DOI and the Portico archive. In the current ReScience publication flow, I'm iffy about the Zenodo DOI for a paper and the fact that it is not captured by Google Scholar.

labarba avatar Jul 05 '20 14:07 labarba

Thank you for all the info! I really struggle to do stuff without a @whedon-like bot. Very excited to move to something more JOSS-like.

oliviaguest avatar Jul 05 '20 15:07 oliviaguest

OK. I'll try to find some time (and/or a student) this summer to do the transition if everyone agrees. @khinsen @benoit-girard @pdebuyl Any opinions?

rougier avatar Jul 05 '20 19:07 rougier

I am all for attempting this transition, but I can't contribute much to it unfortunately, having none of the required technical competences and too many prior obligations for the coming months. What I can contribute to, however, is testing prototypes.

How about setting up a "dummy ReScience" as a prototype? A completely separate structure, attached to another GitHub organization. And do the transition only when we are happy with the prototype.

khinsen avatar Jul 06 '20 06:07 khinsen

I am happy to help if/when I can!

oliviaguest avatar Jul 06 '20 21:07 oliviaguest

I support the idea to transition but haven't much time to assist in the coming weeks.

pdebuyl avatar Jul 15 '20 11:07 pdebuyl