
To what extent are resources part of the API of a distribution?

Open JJ opened this issue 5 years ago • 12 comments

The current META specification allows the installation of, and easy access to, resources, data and configuration files that contain, to a certain extent, functionality or a certain way to tune it.

At the same time, it's advisable to use semantic versioning, the api<> meta or combinations of both, for versioning the distribution API. However, it's not totally clear if these resources are a part of the API or not; this means that it's not clear if a client of the distribution should rely on them being stable (in format and/or location) until a major version changes.

For instance: let's suppose you want to use a template from Documentable. There are quite a few of them:

https://github.com/Raku/Documentable/blob/master/META6.json#L56-L61

Can we rely on template/main.mustache keeping the same name and containing the same kind of thing, at least while the major version does not change? None of those? Just the name, but not the content? Its existence, but not the name or content?

It's not clear how that is used in the ecosystem right now; a distribution using another's resources is probably marginal, and if and when they do, they probably rely on other kinds of interfaces (like downloading directly from the GitHub repo); however, that is clearly not spec and can clearly break at any moment.

So, finally, are resources part of the API of a distribution? Should they be? If they should, what is part of the API and what isn't? How should we specify them?

Update: edited with an example to clarify what I mean.

JJ avatar Sep 25 '20 10:09 JJ

I’m pretty sure Distribution is both documented and specced: $dist.content('resources/config.txt').open(:bin).slurp.decode.chars;

ugexe avatar Sep 25 '20 11:09 ugexe

Removing my assignment as I believe both @ugexe and @niner are far better placed to answer this than I am.

jnthn avatar Sep 25 '20 12:09 jnthn

> I’m pretty sure Distribution is both documented and specced: $dist.content('resources/config.txt').open(:bin).slurp.decode.chars;

I'm not totally sure what this is a counter-argument to. So maybe I'll add an example for clarification, since this probably means I didn't make myself totally clear.

JJ avatar Sep 25 '20 12:09 JJ

See https://github.com/rakudo/rakudo/issues/3821 for my more detailed answer

ugexe avatar Sep 25 '20 12:09 ugexe

@ugexe thanks a lot for that clarification. So the answer is "No, and it shouldn't be", I guess.

JJ avatar Sep 25 '20 12:09 JJ

I generally just try to not encourage the use of %?RESOURCES for anything beyond the most basic things. Its API is not really compatible with non-file system (not to be confused with ::FileSystem) CompUnit::Repository -- for instance the CompUnit::Repository::Tar of S22 (which is like 90% of a naive fatpack solution) cannot implement %?RESOURCES without copying those files to disk, even if everything else is capable of being loaded into memory. However, %?RESOURCES is perfectly suited for distributions that have a build step or need to include external libraries (Inline::Perl5, OpenSSL) which generally expect themselves and others to exist on disk -- thus these distributions could never be loaded by e.g. CompUnit::Repository::Tar anyway.

ugexe avatar Sep 25 '20 13:09 ugexe

"it's not totally clear if these resources are a part of the API or not"

I say, resources are a distribution's internal matter. Distributions may make these resources available through modules (e.g. through a simple function returning a specific resource or even passing through %?RESOURCES as is) but by default they are private. That's why it makes sense for %?RESOURCES to be available only in the lexical scope of modules that are part of the same distribution. In other words, it's the author's choice whether resources are part of a public API, and if they are, it's the author's responsibility to provide an interface for accessing them. Does this answer the question?
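As a rough illustration of the "simple function returning a specific resource" idea, here is a minimal sketch. The distribution, the module name My::Dist::Templates, and the sub name main-template are all hypothetical, and it assumes the file is stored under resources/templates/main.mustache and listed in the "resources" section of the distribution's META6.json:

```raku
# lib/My/Dist/Templates.rakumod
# Hypothetical module inside a hypothetical distribution.
unit module My::Dist::Templates;

# %?RESOURCES is only visible to code that is part of this distribution,
# so this exported sub is the public entry point to the resource.
# Assumption: 'templates/main.mustache' is declared in META6.json.
sub main-template(--> Str:D) is export {
    %?RESOURCES<templates/main.mustache>.slurp
}
```

A client would then call main-template() after a plain `use My::Dist::Templates;`, and the author could rename or relocate the underlying file without breaking that interface.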

On a side note: while it's an unfortunate artifact of development history that the resources API is tied so closely to file systems, I don't think all hope is lost there. There are use cases where a file on the file system is just needed (like with bundled shared libraries). Repository implementations that are not file system based will always have the possibility to extract those resources into temporary files.

niner avatar Sep 25 '20 14:09 niner

You'll have to pardon me for being a bit thick here, but you have just mentioned a use case where it would make a lot of sense to reuse someone else's resource: a compiled shared library. Imagine you want to create a different entry point to a compiled shared library, without re-using the public interface of the class it's included in. You might (rightly) argue that there are several good options for doing so: subclassing, using the public interface, or if you don't want the baggage, simply (this being free software) ripping it from the original and replicating it somewhere else. Each of those has its drawbacks: you might not want all the baggage that comes with subclassing, you might need something that's not available through the class's public interface, or you simply don't want to maintain a copy of the shared library's source.

And please bear in mind that this is not a technical question. It's more of a pragmatic question. I don't know if there are other languages out there that include artifacts such as these within their installed resources. The thing is, once a language does, the cat is out of the bag and people will start to use them back and forth. @niner, you say that resources are a distribution's internal matter, but they are public metadata and people might want to reuse them, in the previously mentioned case or in something we might not envision. So the question is again kinda theoretical; it might have (or not) a technical answer, but in principle it's just theory: I have some artifacts in my distribution. They are out there. People might see them and want to reuse them (through the available interface or simply by downloading them from their location on GitHub). Should we adopt a convention (that might be implemented later) of keeping a part of those artifacts constant, at least within the major version? If so, what part?

I am quite grateful for your laying out the current scenario, but the fact that @tbrowder raised an issue along the same lines, and I do it now, implies that it might be raised again in the future.

Mind you, the decision might be "No, please don't rely on this part of the distribution, unless explicitly published through the public API of the different classes in the distribution", and that's totally legitimate. But I feel that we should go one way or the other.

And now my opinion on the subject. I think that we should deal with artifacts in the same way we deal with REST route names right now. A distribution usually maintains the route names (which are arguments to a function that describes the route) as part of the API. If you keep the functionality and assign it a different route, there will be a deprecation notice. You will still need to test for the functionality, but routes will keep the same name.

So I think that the existence and route to artifacts should be maintained within major/api versions. Please bear in mind that, in the same way that changing major versions with major API changes is not really enforced, this would not (and probably could not) really be enforced. But adding this to API stability would be a nice added value in our ecosystem.

JJ avatar Sep 25 '20 16:09 JJ

Isn't that basically the same argument for accessing any private (variable/method/file/resource/etc)? If a module has something internal like a nice chunk of code you'd like to reuse, just ask the author to make an interface to it.

Perhaps they didn't realize it would be useful outside their module. Perhaps they have other reasons for not making it available through an interface. Perhaps they don't want someone relying on a file they'd like to retain the right to later change or remove.

Lots of stuff starts as private by default - part of the module's internals. Some of it gets exposed via public interfaces later on if the author of the module decides to do so. If they are intransigent and refuse to provide the interface and you care, you can always fork it.

Let's look at it the other way. It is clearly easy (trivial) to add a public API to a resource, marking it as 'public'. If we default a resource to 'public' instead of 'private', how does the module author restrict it, marking it as 'private'?

CurtTilmes avatar Sep 25 '20 17:09 CurtTilmes

@CurtTilmes the problem is that it's not totally and absolutely clear that resources are private. A private attribute is a private attribute. There's a sigil in it and all. A resource is, well, called a "resource", not a "private resource" or "stuff we stash somewhere you don't see and you shouldn't look for". A resource looks like something a resourceful person might have some use for.

My point here is that it's totally legitimate to think that way. However, I do think that, through clever naming or documented conventions, it should be crystal clear to everyone whether it's that way or another one totally different.


Just to show another example of what I'm talking about, check out my (Perl) Test::Text module. It uses affixing rules from Sublime Text, and dictionaries from LibreOffice. I rely on them staying at the same URL, or my module will break. They haven't changed so far, but they could.

That's bound to happen with many Raku modules. They will provide resources, and people might use them. If we conventionally decide we will advise on keeping them in the same place at least while the major version of the resource does not change, so be it. Eventually we might give them tooling to put that to use within Raku. If we decide that no, it's considered conventionally private, exactly the same. Document it, and maybe eventually add tooling for it too.

I guess that my point (and opinion) is clear now, but just in case I repeat it here: I don't think that's unambiguously clear right now, and, if left undecided, it might cause some confusion (in the shape of issues and questions) in the future.

JJ avatar Sep 25 '20 17:09 JJ

There is no convenient way to access another dist's resources (as was discussed in https://github.com/rakudo/rakudo/issues/3821). If that isn't enough of a hint that you're poking into private parts, please document this convention. What kind of tools would be needed to keep the status quo?

I certainly don't see Inline::Perl5's resources as a public API and would have been very much surprised if it was one by some default or convention. I feel free to change them at any point without notice.

niner avatar Sep 25 '20 18:09 niner

My 2 cents here on convenience (sorry if a bit off topic): as the author of some modules that use resources extensively, it'd be cool to be able to install/use not just a file but a whole directory of resources at once ... e.g. ::RESOURCES-DIR

melezhik avatar Sep 28 '20 17:09 melezhik