Make metadata available as Linked Data

Open kjetilk opened this issue 6 years ago • 3 comments

I mentioned this briefly on IRC, and @oalders encouraged me to submit it here.

Some years ago, @tobyink made all of CPAN available as Linked Data with RDF using an old version of the MetaCPAN API. Since then, it has stopped operating, and while @tobyink may redo it, I'm thinking it might need to be done slightly differently, and based on a new framework.

Some context: I'm creating a framework for the Solid Project in which tests can be formulated in RDF and test reports can also be returned as RDF. A basic principle of Linked Data is that you give everything a URL to identify it, and then return some RDF when someone GETs it. For this, I need an RDF URL for everything that goes into those tests, and that's a bunch of CPAN stuff. We are also formulating module metadata as RDF using the DOAP vocabulary, which means that not only can we have a consistent presentation across projects, we can also store all of this and query and analyze it. We can also link to Debian and Git, which also expose RDF.
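To make the "GET a URL, get RDF back" principle concrete, here is a minimal sketch of how a client might ask for RDF via content negotiation. The `build_rdf_request` helper and the `example.org` URL are assumptions for illustration only, and requesting `text/turtle` is just one possible RDF serialization a server could offer.

```python
from urllib.request import Request


def build_rdf_request(url, media_type="text/turtle"):
    """Build a GET request that asks the server for an RDF serialization.

    Hypothetical helper: the media type is an assumption; a real service
    might prefer JSON-LD or RDF/XML instead.
    """
    return Request(url, headers={"Accept": media_type}, method="GET")


# Placeholder resource URL, not a real Linked Data endpoint.
req = build_rdf_request("https://example.org/ld/thing")

# Passing `req` to urllib.request.urlopen() would perform the actual GET;
# that step is left out here so the sketch does not touch the network.
```
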

Once done, I figured it makes sense to have this as part of MetaCPAN rather than as a standalone project like @tobyink's was.

For this, I need URLs for

  • authors
  • distributions
  • individual releases
  • individual modules

So, the question is then how the URLs should look. Perhaps it would make sense to write them with ld in the URL, for example https://metacpan.org/ld/?

The author URL would straightforwardly be https://metacpan.org/ld/author/$pause_id

I think that https://metacpan.org/ld/distribution/$dist could reasonably always refer to metadata for the distribution as of the latest version, where it has links to versions with their changelogs at https://metacpan.org/ld/distribution/$dist/$version

I'm a bit conflicted on whether the list of modules should have versions as well, and whether they should perhaps live under their distribution. I don't have a concrete use case for that right now, so I'm thinking that perhaps only the latest release needs to have RDF metadata. So, perhaps https://metacpan.org/ld/module/$module would suffice.
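The URL patterns above can be sketched as a few small builder functions. This is only an illustration of the proposal: the `/ld` prefix is not an existing MetaCPAN endpoint, the function names are made up, and leaving `:` unescaped in module names is my assumption rather than a settled decision.

```python
from urllib.parse import quote

# Proposed prefix from this issue; not an existing MetaCPAN endpoint.
BASE = "https://metacpan.org/ld"


def _segment(value, safe=""):
    """Percent-encode a single path segment."""
    return quote(value, safe=safe)


def author_url(pause_id):
    """URL for an author, keyed by PAUSE ID."""
    return f"{BASE}/author/{_segment(pause_id)}"


def distribution_url(dist, version=None):
    """URL for a distribution; without a version it means the latest."""
    url = f"{BASE}/distribution/{_segment(dist)}"
    if version is not None:
        url += f"/{_segment(version)}"
    return url


def module_url(module):
    """URL for a module in its latest release.

    Assumption: '::' is left unescaped so that Foo::Bar stays readable.
    """
    return f"{BASE}/module/{_segment(module, safe=':')}"
```

For example, `distribution_url("RDF-Trine")` would identify the distribution as of its latest version, while `distribution_url("RDF-Trine", "1.019")` would identify one specific release.
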

Do you think this is something MetaCPAN could have? If so, I might get around to contributing the code.

kjetilk avatar Oct 25 '19 23:10 kjetilk

My version is working again. I think what's wrong is it's often running out of memory when it rebuilds its database. The VPS it's hosted on is really old. I should move hosting some time; there are much cheaper and better hosting packages available these days.

tobyink avatar Nov 06 '19 00:11 tobyink

Great stuff, @tobyink !

Perhaps we could host it on my box, and perhaps collaborate on it?

kjetilk avatar Nov 06 '19 08:11 kjetilk

Would it be a good idea to version the ld itself, e.g. ldv1?

mohawk2 avatar Jul 03 '20 14:07 mohawk2