jetty.project icon indicating copy to clipboard operation
jetty.project copied to clipboard

Allow non existant resources to be obtained from Resource.resolve()

Open lachlan-roberts opened this issue 1 year ago • 2 comments

Jetty version(s) 12

Description Currently Resource.resolve() will return null if the resource does not exist. This makes Resource harder to use (see case in PR #11279).

We have Resource.exists() method to determine whether a given resource exists, so why can't we get resources from resolve which do not exist?

The implementation of PathResource.resolve() does this:

if (Files.exists(path))
    return newResource(path, resolvedUri);
return null;

lachlan-roberts avatar Feb 15 '24 12:02 lachlan-roberts

I propose that we call this a "potential" Resources, defined as a Resource that does not yet exist, but can possibly change state in the future to existing.

Not all Resource implementations can experience this "potential" Resource behavior (keep reading for examples).

Anyway, some history I was able to recall (from checking the git history).

PathResourceFactory is from ...

Issue #8886 - support extensible Resource URI schemes PR https://github.com/jetty/jetty.project/pull/8888 Commit ee097316bb7556bf120d7d003c978b528898ccd6

It didn't exist until jetty-12.0.0.alpha3

The parent commit is 681e04d68aa00867b1714457e2d0d15caaa2be4d Which existed as part of jetty-12.0.0.alpha2

The origin of many of the null Resource behaviors is from PR https://github.com/jetty/jetty.project/pull/8702 This was an effort to stabilize the behaviors across the Resource codebase.

Back then we had some code paths that returned null, and others that didn't return null. And in many cases a Resource that was created from a base that doesn't exist now, and could never exist in the future.

Having a Resource instance exist with nothing backing it made some things difficult. The CompositeResource (now CombinedResource) could have entries that don't yet exist, then BAM they exist, and are not a directory, breaking the contract of CombinedResource. (If we allow "potential" Resource in the CombinedResource then this is a reality) Having an immutable CombinedResource is a good thing IMO. The Mount system could now have mounts that track things that point to content that does not exist (and could never exist).

When URLResource became a reality (in jetty-12.0.0.alpha3) this also took in all of the changes to URL protocol handlers that were introduced in Java 17, making URL quite a heavyweight concept every time it is accessed, making other tasks around those long lived Resource instances very problematic, especially so if we know the Resource could never exist in the future. Example: for OSGi and Graalvm based URLs, having a URLResource to something that doesn't exist was pointless as they would never exist in the future (due the nature of how those systems work). Having a MountedPathResource.resolve() return a non-existant Resource instance was also not beneficial (as a jar/zip mount couldn't change).

If we do undo these null behaviors, it should be highly selective to only those cases where the "potential" Resource could exist in the future, keeping the return null logic for those Resources that can never change, and never have a "potential" Resource.

We should look carefully if having the current behavior of resolve() just delegating to newResource() is still sane (are they similar? or do they have different contracts?)

The resource implementations should have the main knowledge to know if the resource can ever exist in the future or not. But there are other behaviors where we could tell the Resource implementation that we know this isn't going to change (eg: when MetaInfConfiguration unpacks WEB-INF/lib/*.jar!/META-INF/resources/ into the temp/work directories, and creates Resource objects of those new paths, those bases should have a hint telling them that they will not change and to return null for non-existent references)

joakime avatar Feb 15 '24 16:02 joakime

This null return is now making us write inefficient code, as we cannot resolve to "potential" resources at startup and thus have to resolve on every request. The other APIs for this do not suffer this problem.

Previously Resource.addPath would return non-null results for non-existent resources, as does Path.resolve(...) and new File(File, String) and URI.resolve(...). All these classes have existence tests in their API (of some form), so returning an object that does not exist is simpler and more flexible. There is no need for null checks by the caller (or it might want to do an existence check instead) and it allows logic to work on resources that might exist in future or that we might want to create ourselves.

It was a mistake to change this semantic and we should revert back to the pattern that we have always had and that matches the behaviours of Path, File, URI etc. . Any code written against those API's is in a race with the possible changing existence status and/or type of their targets. The code just has to be robust and handle the fact that an exception might be thrown before or during access to the resource.

Any code that cares about existence will be doing a null check already. That should be trivial to change to an existence check. Of course it can then be deleted a nanosecond later... but this has always been the case with all APIs.

There is nothing we can do to stop the resources listed in a CombinedResource from changing their existence status, or even changing from a directory to a non-directory. That code just needs to be robust enough to either ignore (or perhaps throw) if it detects and illegal state (e.g. one of it's resources has become a non-directory). If there are base resources that cannot change (e.g. Zip), then it should be trivial to return a simple Resource that records the location and always returns false for the existence check.

gregw avatar Feb 16 '24 09:02 gregw

I have started a PR for this. It turns out that we have a fair bit of code that still expected a non null return from resolve and some resource types that would return non null non-existent resources. Much code calling resolve does not need to be updated as it is calling methods like Resources.missing which handles both null and !exists()

gregw avatar Mar 01 '24 09:03 gregw