
Generate a codemeta.json for an arbitrary GitHub repo (via GitHub API crosswalking)

Open cboettig opened this issue 6 years ago • 10 comments

It would be great to have a function that could operate on any GitHub repo and generate at least a rough codemeta.json file from whatever information is returned by the GitHub API (basically what the current import to Zenodo does).

cboettig avatar Jul 10 '17 21:07 cboettig

On any GitHub repo, or only on the GitHub repo of an R package?

maelle avatar Apr 11 '18 13:04 maelle

I am assigning myself since I will need something like that for the registry 👼

maelle avatar Apr 18 '18 08:04 maelle

Would you want to do that without cloning the repo, @cboettig? If we clone the repo, a complete codemeta.json can be generated, though that is probably not what you had in mind?

maelle avatar Nov 06 '18 15:11 maelle

It would just be queries to the GitHub API, e.g. collaborators, name, description, codeRepository, programmingLanguage, maybe license, maybe version, maybe a few others.
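To make the crosswalk concrete, here is a rough Python sketch of the mapping, not an existing codemetar function. It assumes the inputs are already-parsed JSON from the GitHub REST endpoints named in the docstring. The function name `github_to_codemeta` is hypothetical, and using the public contributors endpoint in place of collaborators (which needs push access) is also an assumption.

```python
from typing import Optional

CODEMETA_CONTEXT = "https://doi.org/10.5063/schema/codemeta-2.0"


def github_to_codemeta(
    repo: dict,
    contributors: Optional[list] = None,
    latest_release: Optional[dict] = None,
) -> dict:
    """Crosswalk parsed GitHub API responses into a rough codemeta dict.

    repo:           JSON from GET /repos/{owner}/{repo}
    contributors:   JSON from GET /repos/{owner}/{repo}/contributors (optional)
    latest_release: JSON from GET /repos/{owner}/{repo}/releases/latest (optional)
    """
    meta = {
        "@context": CODEMETA_CONTEXT,
        "@type": "SoftwareSourceCode",
        "name": repo.get("name"),
        "description": repo.get("description"),
        "codeRepository": repo.get("html_url"),
        "programmingLanguage": repo.get("language"),
    }

    # GitHub reports a detected license as an object with an SPDX identifier;
    # "NOASSERTION" means a license file exists but was not recognised.
    spdx_id = (repo.get("license") or {}).get("spdx_id")
    if spdx_id and spdx_id != "NOASSERTION":
        meta["license"] = f"https://spdx.org/licenses/{spdx_id}"

    # The repository endpoint carries no version, so fall back to the
    # latest release tag when one is supplied.
    if latest_release and latest_release.get("tag_name"):
        meta["version"] = latest_release["tag_name"]

    # Contributors only expose a login and a profile URL, not a real name.
    if contributors:
        meta["contributor"] = [
            {"@type": "Person", "@id": c.get("html_url"), "name": c.get("login")}
            for c in contributors
        ]

    # Drop fields the API left empty so the output stays minimal.
    return {key: value for key, value in meta.items() if value is not None}
```

With made-up values, usage would look like:

```python
import json

example_repo = {  # hypothetical response, trimmed to the fields used above
    "name": "example-repo",
    "description": "An example project",
    "html_url": "https://github.com/example-owner/example-repo",
    "language": "Python",
    "license": {"spdx_id": "MIT"},
}
print(json.dumps(github_to_codemeta(example_repo), indent=2))
```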

cboettig avatar Nov 06 '18 16:11 cboettig

And now I think I understand better: this would be a new function for a new use case, one that is not at all specific to repos that are R packages. 👍

Any name ideas for that function?

And any collection of repos you'd like to see this applied to? Maybe something linked to a repo topic, e.g. finding all repos with the "json-ld" topic?
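For the topic idea, a small Python sketch of how such a collection could be listed through the GitHub repository search endpoint. The helper names `topic_search_url` and `repo_full_names` are hypothetical. The sketch only builds the request URL and reads an already-parsed response, so the HTTP call itself (and authentication, rate limits and paging beyond one page) is left out.

```python
from urllib.parse import urlencode

GITHUB_API = "https://api.github.com"


def topic_search_url(topic: str, per_page: int = 30, page: int = 1) -> str:
    """Build the search URL for repositories tagged with a given topic."""
    query = urlencode({"q": f"topic:{topic}", "per_page": per_page, "page": page})
    return f"{GITHUB_API}/search/repositories?{query}"


def repo_full_names(search_response: dict) -> list:
    """Extract 'owner/repo' names from a parsed search response."""
    return [item["full_name"] for item in search_response.get("items", [])]


print(topic_search_url("json-ld"))
```

Each name returned by `repo_full_names` could then be fed to whatever function ends up doing the GitHub-to-codemeta crosswalk.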

maelle avatar Nov 08 '18 08:11 maelle

Technically the function could work on any GitHub repo, regardless of the topic. Maybe something like github_meta()? I dunno.

If you're looking for an interesting case study though, it would make more sense to get the codemeta.json from all DataCite software type objects. I suspect > 90% of these come from Zenodo, with > 90% of the Zenodo ones coming from the direct GitHub import, which currently uses the GitHub API to generate most of the entries (though users can modify that manually or with the undocumented .zenodo.json file, and hopefully eventually via a codemeta.json file instead).

cboettig avatar Nov 08 '18 16:11 cboettig

> Technically the function could work on any GitHub repo, regardless of the topic.

Of course! But I thought it'd be a cool meta example of the usage of this function. 😉

What about collect_github_meta()?

I'm not sure I follow: how would I get a list of repos associated with DataCite software objects?

maelle avatar Nov 08 '18 16:11 maelle

See https://search.datacite.org/works?resource-type-id=software&query= I think this should be accessible from the DataCite API / Content Negotiation as well. We could always bug @mfenner ;-)
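As a rough illustration of the API route, here is a Python sketch that builds a DataCite REST API query for software records and pulls GitHub repos out of a parsed response. It rests on assumptions: that the `dois` endpoint accepts the same `resource-type-id=software` filter as the search page, and that Zenodo's GitHub imports record the repo URL under `relatedIdentifiers`. The helper names are hypothetical, and records without a GitHub link are skipped.

```python
import re
from urllib.parse import urlencode

DATACITE_API = "https://api.datacite.org/dois"
# Captures owner and repo from URLs such as https://github.com/owner/repo/tree/v1.0
GITHUB_REPO = re.compile(r"https?://github\.com/([^/\s]+)/([^/\s#?]+)")


def datacite_software_url(page_size: int = 25) -> str:
    """Build a query URL for DataCite records with resource type 'software'."""
    params = {"resource-type-id": "software", "page[size]": page_size}
    return f"{DATACITE_API}?{urlencode(params)}"


def github_repos(datacite_response: dict) -> dict:
    """Map each DOI to the first GitHub 'owner/repo' among its related identifiers."""
    found = {}
    for record in datacite_response.get("data", []):
        attributes = record.get("attributes", {})
        # Assumed layout: a list of {"relatedIdentifier": ..., "relationType": ...}
        for related in attributes.get("relatedIdentifiers") or []:
            match = GITHUB_REPO.match(related.get("relatedIdentifier") or "")
            if match:
                found[attributes.get("doi")] = f"{match.group(1)}/{match.group(2)}"
                break
    return found
```

The resulting `owner/repo` values would be the list of repos to query on the GitHub side.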

I like the collect_github_meta() name. Yeah, I guess I'm still lukewarm on how useful that function would really be. Hmm...

cboettig avatar Nov 08 '18 16:11 cboettig

@cboettig what advantages would this approach have over cloning the repo? That it would work for something other than R packages? (I'm also wondering how useful this would be.)

maelle avatar Feb 18 '21 07:02 maelle

👋 @cboettig is this issue still "valid"?

maelle avatar Feb 14 '22 11:02 maelle