codemetar
Generate a codemeta.json for an arbitrary GitHub repo (via GitHub API crosswalking)
It would be great to have a function which could operate on any GitHub repo to generate at least a rough codemeta.json file from whatever information was returned by the GitHub API (basically what the current import to Zenodo does).
On any GitHub repo, or only on an R package GitHub repo?
I'm assigning myself since I will need something like that for the registry 👼
Would you want to do that without cloning the repo, @cboettig? If we clone the repo, a complete codemeta.json can be generated, though that's probably not what you had in mind?
It would just be queries to the GitHub API: e.g. collaborators, name, description, codeRepository, programmingLanguage, maybe license, maybe version, maybe a few others.
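To make the crosswalk idea concrete, here is a minimal sketch of how a repository record from the GitHub REST API might be mapped onto a rough codemeta.json. It is written in Python only to keep it self-contained, it is not codemetar code, and the function name crosswalk_github_repo is hypothetical. It assumes the input looks like the JSON returned for a repository (name, description, html_url, language, license.spdx_id), that people come from the public contributors listing rather than collaborators, and that a version string (for example a release tag) is supplied by the caller. The sample values are invented and no network request is made.

```python
import json
from typing import Any, Dict, List, Optional

# JSON-LD context for CodeMeta 2.0.
CODEMETA_CONTEXT = "https://doi.org/10.5063/schema/codemeta-2.0"


def crosswalk_github_repo(
    repo: Dict[str, Any],
    contributors: Optional[List[Dict[str, Any]]] = None,
    version: Optional[str] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: map GitHub repository JSON to a rough codemeta record."""
    meta: Dict[str, Any] = {
        "@context": CODEMETA_CONTEXT,
        "@type": "SoftwareSourceCode",
        "name": repo.get("name"),
        "description": repo.get("description"),
        "codeRepository": repo.get("html_url"),
        "programmingLanguage": repo.get("language"),
    }

    # GitHub reports the license as an object; "NOASSERTION" means it could
    # not be matched to an SPDX identifier, so no license is recorded then.
    license_info = repo.get("license") or {}
    spdx_id = license_info.get("spdx_id")
    if spdx_id and spdx_id != "NOASSERTION":
        meta["license"] = "https://spdx.org/licenses/" + spdx_id

    # Assumption: the caller passes a version, e.g. the latest release tag.
    if version:
        meta["version"] = version

    # Assumption: each entry has at least a "login"; real names are not
    # available from this listing, so the login is stored as alternateName.
    if contributors:
        meta["contributor"] = [
            {"@type": "Person", "alternateName": c["login"], "url": c.get("html_url")}
            for c in contributors
        ]

    # Drop fields GitHub left empty (null description, unknown language).
    return {key: value for key, value in meta.items() if value is not None}


# Invented sample input shaped like a GitHub API repository response.
sample_repo = {
    "name": "example-repo",
    "description": None,
    "html_url": "https://github.com/example-org/example-repo",
    "language": "R",
    "license": {"key": "mit", "spdx_id": "MIT"},
}
record = crosswalk_github_repo(
    sample_repo,
    contributors=[{"login": "octocat", "html_url": "https://github.com/octocat"}],
    version="v0.1.0",
)
print(json.dumps(record, indent=2))
```

Anything that only lives in the repository contents (DESCRIPTION fields, dependencies, citation details) is out of reach for this kind of crosswalk, which is the trade-off against cloning the repo.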
And now I think I understand better, this would be a new function, for a new use case which is absolutely not specific to repos that are R packages. 👍
Any name ideas for that function?
And any collection of repos you'd like to see this applied to? Maybe something linked to a repo topic, e.g. finding all repos with the "json-ld" topic?
Technically the function could work on any GitHub repo, regardless of the topic. Maybe something like github_meta()? I dunno.
If you're looking for an interesting case study though, it would make more sense to get the codemeta.json from all DataCite software type objects. I suspect > 90% of these come from Zenodo, with > 90% of the Zenodo ones coming from the direct GitHub import, which currently uses the GitHub API to generate most of the entries (though users can modify that manually or with the undocumented .zenodo.json file, and hopefully eventually via a codemeta.json file instead).
Technically the function could work on any GitHub repo, regardless of the topic.
Of course! But I thought it'd be a cool meta example of the usage of this function. 😉
What about collect_github_meta()?
I'm not sure I follow: how would I get a list of repos associated with DataCite software objects?
See https://search.datacite.org/works?resource-type-id=software&query= I think this should be accessible from the DataCite API / Content Negotiation as well. We could always bug @mfenner ;-)
I like the collect_github_meta() name. Yeah, guess I'm still lukewarm on how useful that function would really be. Hmm...
@cboettig what advantages would this approach have over cloning the repo? That it'd work for something other than R packages? (I'm also wondering how useful this would be.)
:wave: @cboettig is this issue still "valid"?