ReScience Archive github auxiliary data

As mentioned in https://github.com/ReScience/submissions/issues/41 (and probably discussed earlier, I can't find links though), at some point ReScience should archive all repositories and the extra data, specifically issues and pull request discussions.

Jun 30 '20 17:06 pdebuyl

We're doing that at JOSS. Can you brief here @arfon?

Jun 30 '20 18:06 labarba

Probably we need to switch to the JOSS framework at some point, it will make things easier for everyone (especially me). However, we have some specificities and I don't know if this is compatible:

We have volume / issue / special issue
We have different types of articles
We (now) use Software Heritage instead of Zenodo

There are probably some other specificities that I don't remember right now, but they're certainly minor. @labarba @arfon do you know what the cost of switching would be in person-months? (And would we need to hire someone to do that?)

cc to @benoit-girard @khinsen @oliviaguest

Jun 30 '20 18:06 rougier

None of the things you mention have any impact on adopting the JOSS framework. JOSS has volume/issue as well. Special issue sounds like a minor tweak (but we're adopting theme tracks in JOSS soon, so could be related). To link a paper to the archive, be it Zenodo or another one, all you need is the global identifier and a command: @whedon set <id> as archive. This embeds the identifier in the references metadata field of the Crossref deposit. (Not sure if whedon checks that it is a DOI, but perhaps a tweak is needed if the URL associated with the identifier is not on the doi.org domain.)

Jun 30 '20 19:06 labarba

Ok, great. By Software Heritage, I meant for the source code. For the article we still use Zenodo, but we can switch to Crossref.

Jun 30 '20 20:06 rougier

Gah. Yes, I was talking about the software archive: we use Zenodo, you use Software Heritage... it doesn't matter, because all we need is a global identifier. The bot whedon makes sure that ID goes into the XML file that is deposited with Crossref.

Note that Crossref is only a registry. What is deposited there is the metadata of the paper only.

JOSS makes permanent archival of the paper using Portico.

Jun 30 '20 20:06 labarba

> As mentioned in ReScience/submissions#41 (and probably discussed earlier, I can't find links though), at some point ReScience should archive all repositories and the extra data, specifically issues and pull request discussions.

> We're doing that at JOSS. Can you brief here @arfon?

Not quite. We archive a JSON representation of the JOSS review thread once the paper has been accepted and published. This archive is deposited with Portico, alongside the paper PDF and the Crossref metadata. As @labarba mentioned, the software is archived with Zenodo (typically), but that is usually only the software and not the rest of the repository activity (issues, pull requests, etc.).
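For illustration, a minimal sketch of what "a JSON representation of the review thread" could look like. This is not the JOSS implementation: the function names and the output layout are hypothetical, and the input dictionaries are assumed to follow the field names the GitHub REST API returns for issues and issue comments (`number`, `title`, `user.login`, `created_at`, `body`).

```python
import json
from datetime import datetime, timezone


def build_review_archive(issue, comments):
    """Bundle one issue and its comments into a JSON-serializable dict.

    `issue` and `comments` are assumed to be dicts shaped like GitHub
    REST API responses; the output layout is a hypothetical choice.
    """
    return {
        "archived_at": datetime.now(timezone.utc).isoformat(),
        "issue": {
            "number": issue["number"],
            "title": issue["title"],
            "author": issue["user"]["login"],
            "created_at": issue["created_at"],
            # The API may return null for an empty body.
            "body": issue.get("body") or "",
        },
        "comments": [
            {
                "author": c["user"]["login"],
                "created_at": c["created_at"],
                "body": c.get("body") or "",
            }
            # ISO 8601 UTC timestamps sort chronologically as strings.
            for c in sorted(comments, key=lambda c: c["created_at"])
        ],
    }


def dump_review_archive(archive):
    """Serialize the archive deterministically for deposit."""
    return json.dumps(archive, indent=2, sort_keys=True, ensure_ascii=False)
```

The same structure would apply to pull request discussions, provided the input is fetched in the same shape.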

> We (now) use Software Heritage instead of Zenodo

I suspect this is fine for Whedon to handle. We'd probably have to make a small tweak to allow Whedon to work with a non-DOI string for the archive but I suspect this wouldn't be too hard.
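As a rough sketch of what such a tweak might involve (this is not Whedon's code, and the function names are hypothetical): accept either a DOI or a Software Heritage identifier (SWHID) and resolve it to a URL. The DOI pattern is a common heuristic rather than a full validator, and the SWHID pattern covers the core `swh:1:<type>:<40 hex digits>` form with optional `;`-separated qualifiers.

```python
import re

# Heuristic DOI pattern: "10.", a registrant code, "/", then a suffix.
DOI_RE = re.compile(r"^10\.\d{4,9}/\S+$")
# Core SWHID form, optionally followed by ";"-separated qualifiers.
SWHID_RE = re.compile(r"^swh:1:(cnt|dir|rev|rel|snp):[0-9a-f]{40}(;\S+)?$")


def classify_archive_id(identifier):
    """Return ("doi", value) or ("swhid", value); raise ValueError otherwise."""
    value = identifier.strip()
    # Accept DOIs written as URLs or with a "doi:" prefix.
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if value.lower().startswith(prefix):
            value = value[len(prefix):]
            break
    if DOI_RE.match(value):
        return ("doi", value)
    if SWHID_RE.match(value):
        return ("swhid", value)
    raise ValueError(f"not a recognized DOI or SWHID: {identifier!r}")


def archive_url(identifier):
    """Build a resolvable URL for the archive identifier."""
    kind, value = classify_archive_id(identifier)
    if kind == "doi":
        return "https://doi.org/" + value
    # The Software Heritage archive resolves SWHIDs appended to its base URL.
    return "https://archive.softwareheritage.org/" + value
```

Rejecting anything that is neither a DOI nor a SWHID keeps a malformed string out of the deposited metadata.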

> There are probably some other specificities that I don't remember right now, but they're certainly minor. @labarba @arfon do you know what the cost of switching would be in person-months? (And would we need to hire someone to do that?)

~20 hours, I suspect, although we should probably have a detailed technical discussion to be sure.

Jul 05 '20 12:07 arfon

I want to add that, as a prospective author (we have a draft in a near-ready state), I would be much happier if ReScience made the hop to Open Journals and adopted that system, in particular the Crossref DOI and Portico archive. In the current ReScience publication flow, I'm iffy about the Zenodo DOI for a paper and the fact that it is not captured by Google Scholar.

Jul 05 '20 14:07 labarba

Thank you for all the info! I really struggle to do stuff without a @whedon-like bot. Very excited to move to something more JOSS-like.

Jul 05 '20 15:07 oliviaguest

Ok. I'll try to find some time (and/or a student) this summer to do the transition if everyone agrees. @khinsen @benoit-girard @pdebuyl Any opinions?

Jul 05 '20 19:07 rougier

I am all for attempting this transition, but I can't contribute much to it unfortunately, having none of the required technical competences and too many prior obligations for the coming months. What I can contribute to, however, is testing prototypes.

How about setting up a "dummy ReScience" as a prototype? A completely separate structure, attached to another GitHub organization. And do the transition only when we are happy with the prototype.

Jul 06 '20 06:07 khinsen

I am happy to help if/when I can!

Jul 06 '20 21:07 oliviaguest

I support the idea to transition but haven't much time to assist in the coming weeks.

Jul 15 '20 11:07 pdebuyl
