
Entry points (or API endpoints) extension for codemeta?

Open proycon opened this issue 6 years ago • 5 comments

Schema.org defines EntryPoint as an entrypoint (a URL in some web-based protocol) to software described by the (EntryPoint.actionApplication) property.

Python packages also define entrypoints in their metadata (see https://github.com/proycon/codemetapy/blob/master/setup.py#L35 for example), referring to the command line scripts that will be installed and bound to execute a particular function in a Python module. I would like to represent this in codemeta but have come across a few issues:

  1. schema.org EntryPoints are currently rather web-centered and do not take command line tools into consideration. I could cheat this a bit by simply using the file:/// scheme in the URL to reference an executable script, but that is probably not the most elegant solution. Is more needed? Ideally, I'd be in favour of some kind of interfaceType property on EntryPoint (with values like command line interface, graphical user interface, text user interface, web user interface, webservice interface).
  2. There is only the EntryPoint.actionApplication property; I am missing a reverse property on SoftwareApplication or SoftwareSourceCode to link to entrypoints. I'd propose something simple like codemeta:entryPoints.
  3. Though this is not my take on it, one might argue that command line entrypoints could/should be SoftwareApplication in their own right?

I have implemented 1) and 2) as an optional extension in the brand new Python Codemeta tool (#182).
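A minimal sketch of what points 1) and 2) could look like when serialised, written as a Python dict that is dumped to JSON-LD. This is not the codemetapy implementation. The `entryPoints` and `interfaceType` names follow the proposal above and are not part of codemeta or schema.org; the software name, the executable names, the short values "CLI" and "GUI", and the `file:///` URLs are assumptions made up for illustration.

```python
import json

def describe_software(name, entry_points):
    """Build a codemeta-style description with the proposed extension.

    entry_points is a list of (executable_name, interface_type) pairs.
    "entryPoints" and "interfaceType" are the proposed, non-standard properties.
    """
    return {
        "@context": ["https://doi.org/10.5063/schema/codemeta-2.0"],
        "@type": "SoftwareSourceCode",
        "name": name,
        "entryPoints": [
            {
                "@type": "EntryPoint",
                "name": ep_name,
                # file:/// workaround from point 1; illustrative only
                "urlTemplate": "file:///" + ep_name,
                "interfaceType": interface_type,
            }
            for ep_name, interface_type in entry_points
        ],
    }

# Hypothetical package providing one command line tool and one GUI
doc = describe_software(
    "exampletool",
    [("exampletool", "CLI"), ("exampletool-gui", "GUI")],
)
print(json.dumps(doc, indent=2))
```

Keeping the entrypoints as nested nodes on the software description means the single SoftwareSourceCode record stays the subject, and each executable only carries its own name and interface type.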

proycon avatar Apr 18 '18 08:04 proycon

Thanks for raising the issue. Can you outline a bit of the use case behind documenting generic entrypoints in the codemeta metadata? I see your point about an EntryPoint taking an object of type SoftwareApplication as its actionApplication while not having a reverse property (though, as you observe, JSON-LD provides us @reverse), but in general it is easier to imagine why you'd want the link in that direction. My concern is that describing the entrypoints as part of the software metadata doesn't support any actionable task (because more context is necessary to understand what it does), but merely makes the metadata description more complex. In general, it sounds like other software connected in such a manner might be more natural to list in softwareRequirements like any other dependency? But perhaps I don't follow your envisioned use case.
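For reference, a rough sketch of the JSON-LD `@reverse` route mentioned here, again written as a Python dict. The software node is the subject, and `@reverse` attaches EntryPoint nodes whose `actionApplication` points back at it. The `@id`, name and URL are placeholders, and the helper function is hypothetical.

```python
import json

# The software is the subject; "@reverse" lists nodes that point at it
# through schema.org's actionApplication. All values are placeholders.
software = {
    "@context": "https://schema.org",
    "@type": "SoftwareApplication",
    "@id": "https://example.org/software/exampletool",
    "name": "exampletool",
    "@reverse": {
        "actionApplication": [
            {"@type": "EntryPoint", "urlTemplate": "https://example.org/api/"}
        ]
    },
}

def reverse_entry_points(node):
    """Return the EntryPoint nodes attached via @reverse/actionApplication."""
    return node.get("@reverse", {}).get("actionApplication", [])

for ep in reverse_entry_points(software):
    print(ep["urlTemplate"])

print(json.dumps(software, indent=2))
```

This expresses the link without any new property, at the cost of a less readable document than a dedicated forward property would give.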

Independent of that, it is still a good question whether schema.org's EntryPoint should remain a specifically web-centric concept or be expanded with a type; it might be worth raising that directly with Schema.org. (At the moment, even the web sense of EntryPoint isn't part of the codemeta context.)

cboettig avatar Aug 30 '18 17:08 cboettig

Thanks for your input!

can you outline a bit of the use case behind documenting generic entrypoints in the codemeta metadata?

It's quite common for a software project, a single code base, i.e. a single codemeta:SoftwareSourceCode, to provide multiple entrypoints in the form of multiple command line tools. A Python setuptools setup.py explicitly lists those (see https://github.com/proycon/folia/blob/master/setup.py#L35 for a project where I have a whole bunch), and I wouldn't want to lose this in conversion to codemeta. And of course this is not just a Python thing (see https://github.com/LanguageMachines/frog/blob/master/src/Makefile.am#L3 for an autoconf/C++ example of multiple binaries for command-line tools generated from a single project). Or consider a project like VLC, which provides vlc (a GUI client) and cvlc (a CLI tool), which I'd consider clear examples of two entrypoints to the same thing.

My practical use case is that I developed a web portal (source: https://github.com/proycon/labirinto, production installation: https://webservices-lst.science.ru.nl) that lists software purely on the basis of codemeta (and my proposed entrypoint extensions to it). The software can be filtered on the basis of the types of interfaces it provides (my proposed EntryPoint.interfaceType). I think specifying the interface types (deliberate plural) of software is valuable information that does belong in software metadata. The entrypoints for web applications are directly actionable from the portal; the others are more informational but might be actionable in other contexts (like a shell environment in the case of CLI entrypoints, a graphical desktop menu/launcher for GUI entrypoints...).
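To illustrate the kind of filtering meant here, a small sketch (not the portal's actual code). It assumes each record carries the proposed `entryPoints` list with an `interfaceType` on each entry; the short values "GUI", "CLI" and "WS" and the records other than VLC are made up for the example.

```python
def interface_types(record):
    """Return the set of interface types declared on a record's entry points."""
    return {
        ep["interfaceType"]
        for ep in record.get("entryPoints", [])
        if "interfaceType" in ep
    }

def filter_by_interface(records, wanted):
    """Return the names of all records offering at least one entry point of the wanted type."""
    return [r["name"] for r in records if wanted in interface_types(r)]

# Illustrative records; only the vlc/cvlc pair comes from the discussion above
records = [
    {"name": "VLC", "entryPoints": [
        {"name": "vlc", "interfaceType": "GUI"},
        {"name": "cvlc", "interfaceType": "CLI"},
    ]},
    {"name": "examplewebservice", "entryPoints": [
        {"name": "examplewebservice", "interfaceType": "WS"},
    ]},
    {"name": "examplelib"},  # no entry points declared at all
]

print(filter_by_interface(records, "CLI"))  # ['VLC']
```

Because a single record can declare several entry points, one piece of software can show up under more than one interface type, which is the reason for the deliberate plural.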

My concern is that describing the entrypoints as part of the software metadata doesn't support any actionable task (because more context is necessary to understand what it does), but merely makes the metadata description more complex.

I agree it adds complexity, but I guess it is optional complexity that shouldn't get in the way much, as not everybody needs this level of expressiveness. I hope my example shows that there is actionability, though limited. The actionability is mostly in question where RESTful (or other) webservices are concerned, as the APIs themselves are things we of course don't want to cover, but still it can communicate at least where the webservice can be found.

If you're proposing encoding every single entrypoint as a SoftwareApplication in its own right, then I think that would create a lot of unnecessary duplication; after all, it's the same SoftwareSourceCode with identical metadata.

In general, it sounds like other software connected in such a manner might be more natural to list in softwareRequirements like any other dependency? But perhaps I don't follow your envisioned use case.

There is a big semantic difference I think, so I don't think that would work: dependencies are things that are required by the software, prerequisites that must be satisfied, whereas entrypoints are provided by the software.

I hope the idea is a bit clearer now :)

proycon avatar Aug 30 '18 21:08 proycon

Ah right, I think I see now, not enough coffee before. I think I have a clear picture of what is meant by EntryPoint in a context like a REST API or other web URL, but honestly I'm not quite grokking what this means in the Python context, let alone more generally. Sounds like you are describing the software API, or really any interface to the program? Would you call the headers of a C library the "EntryPoint"? The space of http entrypoints is certainly a lot easier to define than a more generic one.

I could certainly see a web application listing entryPoints in a codemeta description (though I'm wary of trying to re-invent WADL or SOAP via software metadata). I agree that @reverse actionApplication is a lot less intuitive than "entryPoints".

But I have a hard time seeing the semantic meaning in merely listing something like https://github.com/proycon/folia/blob/master/setup.py#L35-L59 as being the 'entrypoints'. To me, this looks relatively unintelligible as metadata (e.g. the entries are meaningful only in that specific context; for that to make any sense semantically we'd need to know what kind of object a foliatools.folia2annotatedtxt:main was).

cboettig avatar Aug 30 '18 22:08 cboettig

My 2 cents: documenting the API endpoints for REST or other types of applications is best handled by the specs that are designed to do that, specifically OpenAPI v3 (https://swagger.io/docs/specification/about/). In CodeMeta, I could see us having a metadata field pointing to an OpenAPI spec for a piece of software, but it seems to me that we shouldn't try to head down that path ourselves. Also, knowing the API for a piece of software is different from knowing the specific deployment details (such as http URIs) for specific deployed instances of that software. We should be sure to make those distinctions clear.

mbjones avatar Sep 05 '18 04:09 mbjones

Also, knowing the API for a piece of software is different from knowing the specific deployment details (such as http URIs) for specific deployed instances of that software. We should be sure to make those distinctions clear.

Very good point indeed. I agree that this is insufficiently clear in what I have suggested thus far.

... but honestly not quite grokking what this means in the Python context, let alone more generally. Sounds like you are describing the software API, or really any interface to the program? Would you call the headers of a C library the "EntryPoint"? The space of http entrypoints is certainly a lot easier to define than a more generic one.

I see entrypoints as executable/callable interfaces in general, yes, so C headers or Python modules or whatever could indeed be classified under that (with a particular type) if we really want to be generic (without going into any further specifics of how to call the API, other than linking to documentation/specifications). But that's already going deeper than I was focussing on, and we perhaps don't need to go there yet.

One of the more practical examples I attempted to describe was a software package offering multiple executables (like the VLC example I gave with vlc vs cvlc). I want to be able to specify in the metadata what executables/callables are provided by a software package and what kind of interface they have (without going into a detailed specification).

But I have a hard time seeing the semantic meaning in merely listing something like https://github.com/proycon/folia/blob/master/setup.py#L35-L59 as being the 'entrypoints'. To me, this looks relatively unintelligible as metadata. (e.g. are meaningful only in that specific context -- for that to make any sense semantically we'd need to know what kind of object a foliatools.folia2annotatedtxt:main was)

Let me try to clarify: I want to list only the first part here, i.e. the executables provided by the software (e.g. what will be installed somewhere in the user's $PATH and what a user can type on the command line or invoke through other means like a menu, so they are actionable and useful information imho). That's what I meant with the Python entrypoints example I gave (the foliatools.folia2annotatedtxt:main part is indeed very Python-specific and not for codemeta). Moreover, important to me is the ability to specify the interface type of the entrypoint, i.e. stating that those are in fact command line tools, or a GUI tool, or a webservice.
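As a rough sketch of "only the first part": given setuptools-style `console_scripts` strings of the form `name = module:function`, keep the executable name and drop the Python-specific target. The spec strings below are hypothetical, and the `interfaceType` key is the proposed (non-standard) property with an assumed "CLI" value.

```python
def executable_names(console_scripts):
    """Extract the executable name (left of '=') from each console_scripts spec."""
    names = []
    for spec in console_scripts:
        name, sep, _target = spec.partition("=")
        if sep:  # skip malformed entries that have no '='
            names.append(name.strip())
    return names

def as_entry_points(console_scripts):
    """Describe each executable as an entry point; the module:function target is dropped."""
    return [
        {"@type": "EntryPoint", "name": name, "interfaceType": "CLI"}
        for name in executable_names(console_scripts)
    ]

# Hypothetical specs in the style of a setup.py entry_points section
specs = [
    "exampletool = examplepkg.cli:main",
    "exampletool-convert=examplepkg.convert:main",
]

print(executable_names(specs))
print(as_entry_points(specs))
```

The result names what the package provides without saying where it ends up being installed, which stays outside the metadata.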

If it is too much of a stretch to use schema's EntryPoint for this, as that's mostly web-focussed, I can understand that of course, but then I think we should come up with an alternative? The idea to use EntryPoint came from using as much of the existing model as possible (and Python setuptools calls it entrypoints too), and I do see a generic similarity between, say, an executable and a Web EntryPoint, differing mostly in interface type. Though I agree with @mbjones that a clearer distinction needs to be made between installation-specific issues, such as where the entry point is deployed in a particular context (what hostname etc., or for an executable say /usr/local/bin/ vs /usr/bin; probably not part of codemeta as it's not an inherent part of the software), and what is provided (e.g. the name of the executable).

In CodeMeta, I could see us having a metadata field pointing to an OpenAPI spec for a piece of software

Yes, such a link to a specification, API or even more generic documentation in general is exactly what I'd envision to be associated with EntryPoints, and not just relevant for web-endpoints.

Sorry for the long posts, hopefully this clears up some confusion and doesn't add too much new confusion ;)

proycon avatar Sep 05 '18 11:09 proycon