javacpp
Integration of JavaCPP fails for Eclipse/OSGi application
I'm trying to integrate bytedeco/cpython into an Eclipse-based application (which uses the Equinox OSGi runtime, like all other Eclipse applications). I tried several different approaches to integrate cpython and javacpp, but none of them was successful:
- include it via the Target-Platform
- embed the main jars directly into an Eclipse plug-in and the platform-specific jars in corresponding fragments
- embed the unpacked content of the jars in corresponding Eclipse plug-ins/fragments
All suffered from the same problem: the javacpp.Loader cannot properly handle URLs using the bundleresource scheme (like bundleresource://491.fwk108568334:2/org/bytedeco/javacpp/windows-x86_64/api-ms-win-crt-locale-l1-1-0.dll).
These URLs are returned by the EquinoxClassLoader used by Eclipse when the javacpp.Loader queries resources in its findResources() methods.
With those URLs the caching of packages fails and the libraries cannot be loaded.
Therefore my question is: is it possible to use javacpp and its presets in an Eclipse/OSGi application? And if yes, are there best practices for integrating them?
I noticed you have an osgi branch where some contributions were made some time ago, but it seems to be stale and I could not find information on how to use the current main artefacts in Eclipse/OSGi.
If it is not possible yet, one way to solve the problem mentioned above could be to include org.eclipse.platform:org.eclipse.osgi as an optional dependency and use the org.eclipse.osgi.storage.url.BundleURLConverter to resolve bundleresource URLs if that class is available at runtime (if it cannot be loaded, it is likely not required).
If the suggested change is suitable and wanted, I can create a PR for it.
I'm not sure this is the only step required to enable the use of javacpp in Eclipse, but it seems to be a necessary first one.
The presets currently don't come with anything for OSGi, so users like @reckart simply create their own uber JAR bundle out of whatever they need, and it works fine.
If you would like to contribute changes to enhance OSGi support, that would be great! Thanks
/cc @timothyjward
It works for me to:
- install JavaCPP as bundles via the target platform (1.9.4 works currently best)
- install Apache Aries SPI loader via the target platform
- package up any jar libs and native libs (e.g. DL4J) into a fat bundle
Thanks for describing your setup. But I would like to skip the creation of a fat jar and just include javacpp and some of the presets from my Eclipse PDE Target-Platform. The Eclipse m2e-pde integration is used to include bundles from Maven Central and to wrap Maven artifacts that are not yet OSGi bundles.
The only things that prevent me from doing this are that the javacpp.Loader cannot handle the bundleresource URLs returned by the EquinoxClassLoader used by Eclipse, and that the javacpp.presets package is not exported in the 1.5.5 release. The latter is already fixed with PR #490. I don't think including Aries SPI helps with either.
For the first problem I have created a solution based on what I have described initially and pushed it to my javacpp fork: https://github.com/HannesWell/javacpp/tree/bundleresourceURLsupport
But while thinking again about my use case, I came to the conclusion that for us it would be better to be able to fully intercept and control the Loader's search for resources. For this approach I created another branch in my fork: https://github.com/HannesWell/javacpp/tree/resourceResolver
This would have the additional advantage that javacpp does not need any dependency on org.eclipse.osgi, and it gives users more opportunities to enhance the search for resources. On the other hand, javacpp would not run under OSGi out of the box and some configuration would be needed. But if it is documented well, I don't consider this a problem.
Furthermore, I am considering using Java's ServiceLoader API instead of providing a static registry. This would avoid the requirement to register the resolver before the Loader class itself is loaded, which could be a bit tricky when used from multiple locations.
Anyway, if you agree with the second approach I can create a PR and we can discuss the details there.
The presets currently don't come with anything for OSGi, so users like @reckart simply create their own uber JAR bundle out of whatever they need, and it works fine.
That's not a blocker for me, since the mentioned Eclipse m2e-pde integration is capable of wrapping Maven artifacts as OSGi bundles. However it would reduce the configuration effort for any of the presets if the presets would also have OSGi Metadata included. Since this was done for javacpp in PR #332 I think it should be possible for javacpp-presets as well. But I think this is an issue for the presets repo.
I'm pretty sure @timothyjward got the bit for bundleresource working, but if there's anything wrong with it, let's fix it, sure.
@HannesWell Why do we need to have dependencies on external APIs to load Java resources? Could you elaborate on this? I thought the JDK had all the necessary interfaces that properly abstracted all that away for any container we may be running on.
That's not a blocker for me, since the mentioned Eclipse m2e-pde integration is capable of wrapping Maven artifacts as OSGi bundles. However it would reduce the configuration effort for any of the presets if the presets would also have OSGi Metadata included. Since this was done for javacpp in PR #332 I think it should be possible for javacpp-presets as well. But I think this is an issue for the presets repo.
Yes, like I said, contributions welcome! @timothyjward was supposed to start working on that, but that didn't happen, so you could pick it up from there.
@HannesWell I started looking at this a bit more closely. What is the class of the URLConnection that we get when calling URL.openConnection() on a "bundleresource"? If it's a subclass of JarURLConnection, there shouldn't be any need to resolve it...
I'm pretty sure @timothyjward got the bit for bundleresource working, but if there's anything wrong with it, let's fix it, sure.
@HannesWell Why do we need to have dependencies on external APIs to load Java resources? Could you elaborate on this?
From the tests created by @timothyjward I assume he's using the Apache Felix OSGi implementation. But I'm using javacpp/cpython in an Eclipse application, which uses the Eclipse implementation of OSGi called Equinox.
When running Equinox-based applications, usually the org.eclipse.osgi.internal.loader.EquinoxClassLoader is used. With this EquinoxClassLoader the methods Class.getResource(String) and ClassLoader.getResources(String), which are used in javacpp.Loader.findResources(Class, String, int), return URLs that use the mentioned bundleresource scheme.
I'm not familiar with Apache Felix, but I assume that the URLs returned by Apache Felix use schemes that are native to the Java Platform, like file, jar or http(s).
URLs with a scheme unknown to the Loader then cause problems in Loader.cacheResource(URL,String): folders cannot be extracted, only single files, and without knowing the content of a folder the files to extract cannot be specified. As a result, native libraries are missing and cannot be loaded.
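To make the failure mode above concrete, here is a minimal sketch of a scheme check. The class and method names are hypothetical and are not part of JavaCPP; the set of "listable" schemes reflects only what the JDK itself can enumerate (directories via java.io.File, jar entries via JarFile, the run-time image via the jrt file system), which is an assumption about how a loader might decide.

```java
import java.net.URI;

public class SchemeCheck {
    /**
     * Hypothetical helper: returns true when the JDK alone offers a way
     * to enumerate the entries below this location.
     */
    public static boolean canListWithJdk(URI resource) {
        String scheme = resource.getScheme();
        return "file".equals(scheme) || "jar".equals(scheme) || "jrt".equals(scheme);
    }

    public static void main(String[] args) {
        // URI is used instead of URL because a plain JVM has no stream
        // handler registered for the bundleresource scheme.
        URI bundle = URI.create(
                "bundleresource://491.fwk108568334:2/org/bytedeco/javacpp/windows-x86_64/");
        System.out.println(bundle.getScheme() + " listable: " + canListWithJdk(bundle));
    }
}
```

A single file behind a bundleresource URL can still be read through its stream; it is only the enumeration of a directory that has no JDK-level answer for this scheme.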
I thought the JDK had all the necessary interfaces that properly abstracted all that away for any container we may be running on.
@HannesWell I started looking at this a bit more closely. What is the class of the URLConnection that we get when calling URL.openConnection() on a "bundleresource"? If it's a subclass of JarURLConnection, there shouldn't be any need to resolve it...
When calling URL.openConnection() on a "bundleresource" URL we get an org.eclipse.osgi.storage.url.BundleURLConnection. This class inherits directly from URLConnection and therefore does not provide more 'native' data. I think this is because a bundle can have both shapes: packed into a jar or unpacked as a directory.
So it seems like in this case not everything is abstracted away properly, or at least Equinox is not very cooperative in this use case.
When opening a stream for a bundleresource URL that denotes a directory, the stream is also empty (in some implementations it contains a list of the files in that directory).
So I don't see a suitable way to enable using javacpp in an Eclipse Equinox environment without an optional/provided dependency on org.eclipse.osgi.
I wonder why it is working for @reckart, so it would be great if he can provide more details. Because even when I pack the java-cpp preset jars into a fat bundle (i.e. Eclipse plug-in) I got the same issues as described above.
I can't provide you with access to the code, but if there is something specific I should look for and report, please ask.
Note, it works for us when running in Eclipse (Equinox) as well as when running in Karaf (Felix I believe).
However, for me it would be even better if we could fully intercept the resource search with a generic approach, even though it goes beyond the scope of this issue: we use a custom resource locator that is capable of finding resources in Eclipse plug-ins in both situations, when they run in an Equinox environment and when they are used in a JUnit test started within the Eclipse IDE. In the latter case it is a pure Java runtime. This is a bit hacky but it works well for our use cases.
I have already created an exemplary commit using a static ResourceResolverRegistry here.
But using a static factory has the drawback that one has to make sure that a resolver is registered before the Loader is initialized.
Using a java.util.ServiceLoader would avoid the need to do this and is therefore more suitable in my opinion.
For this second approach I have created a separate branch, too: https://github.com/HannesWell/javacpp/tree/resourceResolverService
This implements the logic around the ServiceLoader and prepares its use in an OSGi application.
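For readers unfamiliar with the mechanism, the consuming side of a ServiceLoader-based lookup could look roughly like the sketch below. It is not the code from the branch: the nested ResourceResolver interface is a simplified, hypothetical stand-in, and the "first provider wins, otherwise fall back to Class.getResource" policy is an assumption made for illustration.

```java
import java.net.URL;
import java.util.ServiceLoader;

public class ResolverLookup {
    /** Hypothetical, simplified stand-in for the proposed resolver interface. */
    public interface ResourceResolver {
        URL resolveResource(Class<?> clazz, String name);
    }

    /** Fallback when no provider is registered: plain Class.getResource. */
    public static final ResourceResolver DEFAULT =
            (clazz, name) -> clazz.getResource(name);

    /** Returns the first provider found by ServiceLoader, or DEFAULT if there is none. */
    public static ResourceResolver findResolver() {
        for (ResourceResolver resolver : ServiceLoader.load(ResourceResolver.class)) {
            return resolver; // assumed policy: first registered provider wins
        }
        return DEFAULT;
    }

    public static void main(String[] args) {
        ResourceResolver resolver = findResolver();
        System.out.println(resolver == DEFAULT ? "using default resolver" : "using provider");
    }
}
```

Because the lookup happens lazily when findResolver() is called, nothing has to be registered before the consuming class is loaded, which is the advantage over a static registry described above.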
Suppose one wants to provide their own ResourceResolver named org.example.MyResolver; these are the steps to perform:
In general:
1. Create the class org.example.MyResolver implementing org.bytedeco.javacpp.Loader.ResourceResolver
2. Create the provider-configuration file META-INF/services/org.bytedeco.javacpp.Loader$ResourceResolver in your jar with the following content:
org.example.MyResolver
More information can be found in the Javadoc of java.util.ServiceLoader.
When you want to provide your own ResourceResolver service in an OSGi application, the following steps have to be performed additionally:
3. In the MANIFEST.MF of the bundle containing the provided ResourceResolver implementation, add the following headers:
Require-Capability: osgi.extender;filter:="(osgi.extender=osgi.serviceloader.registrar)"
Provide-Capability: osgi.serviceloader;osgi.serviceloader=org.bytedeco.javacpp.Loader$ResourceResolver
4. Add an implementation of the OSGi Service Loader Mediator Specification to your application. The SPI Fly component of the Apache Aries library is one possible implementation. It comes in two versions, a static and a dynamic one. I used the dynamic one because it does not require further modification of the jars. The Maven coordinates of the corresponding dependency are:
<dependency>
  <groupId>org.apache.aries.spifly</groupId>
  <artifactId>org.apache.aries.spifly.dynamic.bundle</artifactId>
  <version>1.3.4</version>
</dependency>
5. Make sure the org.apache.aries.spifly.dynamic.bundle is started automatically by your OSGi application. For an Eclipse product this can be ensured in the Configuration tab of the Product Editor by adding org.apache.aries.spifly.dynamic.bundle with start level 0 (default) and Auto-Start=true.
When one uses the static version of Apache Aries SPI Fly, steps 4 and 5 may not be necessary.
More information about the OSGi Service Loader Mediator Service can be found under the following links:
- https://blog.osgi.org/2013/02/javautilserviceloader-in-osgi.html
- https://docs.osgi.org/specification/osgi.enterprise/7.0.0/service.loader.html
More information about Apache Aries SPI Fly can be found here:
- https://aries.apache.org/documentation/modules/spi-fly.html
This actually goes beyond the scope of this issue, but for me it would be sufficient. However, it could also be used to resolve this issue in its actual intention: make javacpp work in an Eclipse-Equinox OSGi environment.
For example, one could provide a corresponding EquinoxBundleResourceResolver using the convenience method FileLocator.resolve(URL) from org.eclipse.core.runtime:
import java.io.IOException;
import java.net.URL;
import java.util.Iterator;

import org.bytedeco.javacpp.Loader.ResourceResolver;
import org.eclipse.core.runtime.FileLocator;

public class EquinoxBundleResourceResolver implements ResourceResolver {

    @Override
    public URL resolveResource(Class clazz, String name) throws IOException {
        return resolve(clazz.getResource(name));
    }

    @Override
    public Iterator<URL> resolveResources(ClassLoader classLoader, String name) throws IOException {
        return classLoader.resources(name).map(EquinoxBundleResourceResolver::resolve).iterator();
    }

    private static URL resolve(URL resource) {
        try {
            return FileLocator.resolve(resource);
        } catch (IOException e) {
            throw new IllegalArgumentException(e);
        }
    }
}
Javacpp could provide the description above as an article/wiki entry, so users who require it can implement it themselves accordingly. Alternatively, javacpp could create a separate javacpp.eclipse.osgi Maven module that provides the EquinoxBundleResourceResolver, has the corresponding MANIFEST.MF headers and pulls in the org.apache.aries.spifly.dynamic.bundle. Then users in Equinox would only have to add javacpp.eclipse.osgi and make sure that org.apache.aries.spifly.dynamic.bundle is started automatically.
Thanks for your quick reply @reckart.
I can't provide you with access to the code, but if there is something specific I should look for and report, please ask.
Note, it works for us when running in Eclipse (Equinox) as well as when running in Karaf (Felix I believe).
That's interesting and I'm puzzled how this works for you under Equinox.
What would be very interesting for me is what kind of URL, especially the scheme/protocol, is returned by cls.getResource(String) or classLoader.getResources() in javacpp.Loader.findResources(Class, String, int). Is this also a bundleresource URL?
If yes, how does the javacpp.Loader handle them when it comes to unpacking the jars, and did you configure something special so they can be handled? And if it is not a bundleresource URL, how did you manage to get another URL?
I've bundled the main artefacts of javacpp and the presets for cpython, numpy and openblas in one plug-in and the platform-specific artefacts in corresponding fragments for the main plug-in (we support Windows and Linux). Furthermore we use a stripped version of the embedded-python Python class. Additionally, in order to make the Loader work with bundleresource URLs, I embedded a patched Loader that looks like the one I've pushed here: https://github.com/HannesWell/javacpp/tree/bundleresourceURLsupport
Have you built your jars in a completely different manner, so that it works for you?
I'm not familiar with Apache Felix, but I assume that the URLs returned by Apache Felix use schemas that are native to the Java Platform, like file, jar or http(s).
This is incorrect. Felix also returns URLs with a custom scheme, also called bundleresource but not the same implementation as Equinox has, and definitely not interchangeable.
When calling URL.openConnection() on a "bundleresource" URL we get a org.eclipse.osgi.storage.url.BundleURLConnection. This class inherits directly from URLConnection and therefore does not provide more 'native' data. I think this is because a bundle can have both shapes: packed into a jar or unpacked as a directory.
There are a variety of reasons, but yes, most frameworks support exploded format bundles. You definitely can't rely on the URLConnection being a particular implementation-independent type.
I've bundled the main artefacts of javacpp and the presets for cpython, numpy and openblas in one plug-in and the platform-specific artefacts in corresponding fragments for the main plug-in (we support Windows and Linux). Furthermore we use a stripped version of the embedded-python Python class. Additionally, in order to make the Loader work with bundleresource URLs, I embedded a patched Loader that looks like the one I've pushed here: https://github.com/HannesWell/javacpp/tree/bundleresourceURLsupport
Have you built your jars in a completely different manner, so that it works for you?
At a guess you haven't added the Bundle-NativeCode header to your OSGi bundle? This is the thing which tells OSGi (in the absence of JavaCPP) where your native code lives and causes it to be properly loaded as needed. If you add this then do things start working properly?
Have you built your jars in a completely different manner, so that it works for you?
It actually works fine for cases where we don't need to extract directories. The API provided by URLConnection allows us to access the data of specific files such as DLLs, which is all we need for typical native libraries like OpenBLAS, but CPython and NumPy are not typical native libraries. They require a large number of additional files to function. (BTW, if you don't need CPython and NumPy specifically, you may want to try out the C++ API of PyTorch, which provides most of the functionality of NumPy, but without requiring CPython: https://github.com/bytedeco/javacpp-presets/tree/master/pytorch.)
Ok, so this is getting complicated. I don't think coming up with a plugin system of sorts is the right way to go. We have a similar problem in the case of GraalVM Native Image and jlink, and although it doesn't look like there is a way to list resources for the former yet, according to @AlanBateman https://bugs.jython.org/msg12173, it's possible to do it for the latter with something like this using FileSystems:
FileSystem jrtfs = FileSystems.getFileSystem(URI.create("jrt:/"));
Path top = jrtfs.getPath("/");
Files.walk(top).forEach(System.out::println);
If we can use FileSystems in the case of OSGi as well, let's do that instead of reinventing the wheel.
@HGuillemet Could you update Loader.extractResource() to try and extract directories using FileSystems?
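As an illustration of what directory extraction through FileSystems could look like for an ordinary JAR or ZIP archive, here is a minimal sketch. It is not the implementation of Loader.extractResource(); the class name, method name, and parameters are hypothetical, and it only covers archives for which the JDK's zip file system provider is available. It says nothing about bundleresource URLs, for which no file system provider is known from this thread.

```java
import java.io.IOException;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.stream.Stream;

public class DirectoryExtractor {
    /**
     * Hypothetical helper: copies the tree rooted at dirInArchive inside a
     * zip/jar archive into targetDir on the default file system.
     */
    public static void extractDirectory(Path archive, String dirInArchive, Path targetDir)
            throws IOException {
        // The cast disambiguates between the newFileSystem overloads that accept null.
        try (FileSystem fs = FileSystems.newFileSystem(archive, (ClassLoader) null)) {
            Path root = fs.getPath(dirInArchive);
            try (Stream<Path> walk = Files.walk(root)) {
                for (Path source : (Iterable<Path>) walk::iterator) {
                    // Convert to String so the path can cross file system providers.
                    Path target = targetDir.resolve(root.relativize(source).toString());
                    if (Files.isDirectory(source)) {
                        Files.createDirectories(target);
                    } else {
                        Files.createDirectories(target.getParent());
                        Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING);
                    }
                }
            }
        }
    }
}
```

A call such as DirectoryExtractor.extractDirectory(jarPath, "/org/bytedeco/cpython", cacheDir) would copy everything below that directory, which is the part that cannot be done when only single files can be opened through a URLConnection.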
@HannesWell
That's interesting and I'm puzzled how this works for you under Equinox. What would be very interesting for me is, what kind of URL, especially the schema/protocol, is returned by cls.getResource(String) or classLoader.getResources() in javacpp.Loader.findResources(Class, String, int)? Is this also a bundleresource URL?
This is what I get in Loader:862 -- URL url = cls.getResource(name); when running in Eclipse/Equinox.
bundleresource://12.fwk1545841215:3/org/bytedeco/javacpp/macosx-x86_64/libjnijavacpp.dylib
I'm not quite sure why I'm receiving notifications from this issue but just to say that the jrt file system provider is for accessing resources in the run-time image. IDEs and some tools use it to access the class files or other resources in the current or remote run-time image.
If I understand the issue here then there is a custom URL stream handler somewhere that uses the "bundleresource" scheme. The 4-arg java.net.URL constructor specifies how URL stream handlers are located. For Eclipse/Equinox then I would expect deploying a URLStreamHandlerProvider implementation on the class path would be the simplest. URL uses ServiceLoader to locate custom URLStreamHandlerProvider implementations on the class path.
I'm not familiar with Apache Felix, but I assume that the URLs returned by Apache Felix use schemas that are native to the Java Platform, like file, jar or http(s).
This is incorrect. Felix also returns URLs with a custom scheme, also called bundleresource but not the same implementation as Equinox has, and definitely not interchangeable.
Thanks for this clarification.
At a guess you haven't added the Bundle-NativeCode header to your OSGi bundle? This is the thing which tells OSGi (in the absence of JavaCPP) where your native code lives and causes it to be properly loaded as needed. If you add this then do things start working properly?
You're right, I haven't specified the Bundle-NativeCode header, but I assumed javacpp is in charge of managing and loading the native libraries and I should not have to do it manually.
This is what I get in Loader:862 -- URL url = cls.getResource(name); when running in Eclipse/Equinox.
bundleresource://12.fwk1545841215:3/org/bytedeco/javacpp/macosx-x86_64/libjnijavacpp.dylib
Thanks. That's the same for me.
Have you built your jars in a completely different manner, so that it works for you?
It actually works fine for cases where we don't need to extract directories. The API provided by URLConnection allows us to access the data of specific files such as DLLs, which is all we need for typical native libraries like OpenBLAS, but CPython and NumPy are not typical native libraries. They require a large number of additional files to function.
I think that's the key-point of the problem. Extraction of directories does not work with Equinox-OSGi.
(BTW, if you don't need CPython and NumPy specifically, you may want to try out the C++ API of PyTorch, which provides most of the functionality of NumPy, but without requiring CPython: https://github.com/bytedeco/javacpp-presets/tree/master/pytorch.)
Thanks for the hint, but we want to use python and numpy as part of that.
Ok, so this is getting complicated. I don't think coming up with a plugin system of sorts is the right way to go. We have a similar problem in the case of GraalVM Native Image and jlink, and although it doesn't look like there is a way to list resources for the former yet, according to @AlanBateman https://bugs.jython.org/msg12173, it's possible to do it for the latter with something like this using FileSystems:
FileSystem jrtfs = FileSystems.getFileSystem(URI.create("jrt:/"));
Path top = jrtfs.getPath("/");
Files.walk(top).forEach(System.out::println);
If we can use FileSystems in the case of OSGi as well, let's do that instead of reinventing the wheel.
@HGuillemet Could you update Loader.extractResource() to try and extract directories using FileSystems?
Yes indeed, it is probably too complicated.
Your suggestion using FileSystems sounds interesting. But unfortunately it seems like Equinox does not provide a FileSystem for the bundleresource scheme. And implementing one myself looks far more complicated when I look into the FileSystem and FileSystemProvider sources.
However I would be satisfied with a 'simple' ResourceResolver registry like I have prototyped here: https://github.com/HannesWell/javacpp/commit/d239b230570d20c41f92b16850d9dc83db623c6f
@HannesWell in our code (where it works), we do have a <Bundle-NativeCode> section in the POM, an <Embed-Dependency>...inline=org/bytedeco/**/*.dylib....</ > and also a <SPI-Provider> for the Nd4JBackend and NDArrayCompressor.
I think that's the key-point of the problem. Extraction of directories does not work with Equinox-OSGi.
If this is the goal then realistically JavaCPP will need some OSGi-aware code which gets hold of the bundle containing the native code (based on the classloader being a BundleReference) and then uses getEntryPaths to find the files.
Before going down this route I would like to be sure that setting the Bundle-NativeCode header isn't enough to get it working. The reason for this is that JavaCPP would have to start tracking bundles to invalidate their cached libraries if they are uninstalled/updated, whereas right now I'm pretty sure OSGi is managing the library loading and making that a non-issue.
We can avoid OSGi or framework-specific code in JavaCPP if we use some service loader mechanism: adding a jar providing the service for the framework actually used to the class path or module path would be enough.
We can either use a custom service, like @HannesWell suggests, but would something simple that transforms a URL with the unknown scheme to a standard one (file: or jar:) not be enough? For Eclipse, the implementation would just call FileLocator.resolve.
Or we use an existing standard service: FileSystem, as suggested by @saudet. This option would require changing JavaCPP to use FileSystem instead of jar- or jrt-specific code, and implementing a FileSystem for the Eclipse bundleresource: scheme. This is a bit complicated indeed, but we do not need the full features of a file system. We need to be able to list the files in a directory and to obtain a file size and a last-modified date.
We can avoid OSGi or framework-specific code in JavaCPP if we use some service loader mechanism: adding a jar providing the service for the framework actually used to the class path or module path would be enough.
This is unlikely to work well. In an OSGi framework there isn’t a classpath or module path to use for this. Each bundle has a classloader and those are wired in a graph to delegate loading for certain packages (as defined by Import-Package in the bundle). ServiceLoader will therefore not see anything to load unless carefully managed.
We can either use a custom service, like @HannesWell suggests, but would something simple that transforms a URL with the unknown scheme to a standard one (file: or jar:) not be enough? For Eclipse, the implementation would just call FileLocator.resolve.
This is approximately what I’m suggesting as “OSGi aware code”, but I would recommend using the OSGi api over the Equinox API so that it works in Felix too.
Or we use an existing standard service: FileSystem, as suggested by @saudet. This option would require changing JavaCPP to use FileSystem instead of jar- or jrt-specific code, and implementing a FileSystem for the Eclipse bundleresource: scheme. This is a bit complicated indeed, but we do not need the full features of a file system. We need to be able to list the files in a directory and to obtain a file size and a last-modified date.
This would also be some OSGi aware code to provide a Filesystem on top of a bundle.
I'm not quite sure why I'm receiving notifications from this issue but just to say that the jrt file system provider is for accessing resources in the run-time image. IDEs and some tools use it to access the class files or other resources in the current or remote run-time image.
If I understand the issue here then there is a custom URL stream handler somewhere that uses the "bundleresource" scheme. The 4-arg java.net.URL constructor specifies how URL stream handlers are located. For Eclipse/Equinox then I would expect deploying a URLStreamHandlerProvider implementation on the class path would be the simplest. URL uses ServiceLoader to locate custom URLStreamHandlerProvider implementations on the class path.
@AlanBateman Thanks for pitching in! I'm sorry I didn't make the reason clear as to why I mentioned you here. I'm under the impression that FileSystems is the "new" preferred mechanism that should be used to access resources from wherever they come from, and it's already supported everywhere in the JDK, including for images generated by jlink, so specifications like OSGi should also provide support. Does that sound about right?
Yes indeed, it is probably too complicated. Your suggestion using FileSystems sounds interesting. But unfortunately it seems like Equinox does not provide a FileSystem for the bundleresource scheme. And implementing one myself looks far more complicated when I look into the FileSystem and FileSystemProvider sources.
However I would be satisfied with a 'simple' ResourceResolver registry like I have prototyped here: HannesWell@d239b23
@HannesWell From what I understand, that approach works only for JAR files, and not for native images from jlink, GraalVM, or what have you, am I right? In the case of OSGi though, if we're fine with supporting JAR files only for now, how about we put those few lines of code (that or what @timothyjward suggested) in JavaCPP but using reflection instead to avoid adding dependencies or expanding the API? It's not like loading libraries is any sort of a bottleneck, and it would also save us (read me) from maintaining an API that's not general enough anyway.
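The reflection idea could be sketched as follows. This is only an illustration under stated assumptions, not code from JavaCPP: the class and method names are hypothetical, the only external call is the static FileLocator.resolve(URL) from org.eclipse.core.runtime mentioned earlier in this thread, and whether Class.forName can actually see that class depends on how the calling bundle is wired in the OSGi framework.

```java
import java.lang.reflect.Method;
import java.net.URL;

public class ReflectiveUrlResolver {
    /**
     * Hypothetical helper: tries to convert a framework-specific URL into a
     * standard one by calling org.eclipse.core.runtime.FileLocator.resolve(URL)
     * reflectively. Returns the URL unchanged when that class is not visible
     * or the call fails.
     */
    public static URL resolve(URL resource) {
        try {
            Class<?> locator = Class.forName("org.eclipse.core.runtime.FileLocator");
            Method resolve = locator.getMethod("resolve", URL.class);
            return (URL) resolve.invoke(null, resource);
        } catch (ReflectiveOperationException | LinkageError e) {
            // Not running with the Eclipse runtime visible, or resolution failed:
            // keep the original URL and let the caller handle it as before.
            return resource;
        }
    }
}
```

Outside Eclipse the method is a no-op, so there is no compile-time dependency and no new public API, which matches the trade-off described above.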
Or we use an existing standard service: FileSystem, as suggested by @saudet. This option would require changing JavaCPP to use FileSystem instead of jar- or jrt-specific code, and implementing a FileSystem for the Eclipse bundleresource: scheme. This is a bit complicated indeed, but we do not need the full features of a file system. We need to be able to list the files in a directory and to obtain a file size and a last-modified date.
@HGuillemet It looks to me like using FileSystems in JavaCPP would only be a few lines of code, so it doesn't sound like that big of a deal, and it's the only API available to access images from jlink anyway, so I'm leaving it up to you if you want to implement it or not, regardless of OSGi. :)
We can avoid OSGi or framework-specific code in JavaCPP if we use some service loader mechanism: adding a jar providing the service for the framework actually used to the class path or module path would be enough.
This is unlikely to work well. In an OSGi framework there isn’t a classpath or module path to use for this.
Ok, I see that service loading is not really compatible with OSGi and would need some complex intermediary code. Let's set aside the service loading options.
Another option is to not use JavaCPP at all for extracting native libs but to use the OSGi feature based on the Bundle-NativeCode header that you mention. So basically you would extract the JavaCPP native jars into directories of your choice, reference the directories in the Bundle-NativeCode header and bundle them. However, JavaCPP does not only unzip the jar when extracting native libs: it can reorganize it (flatten or not), create symbolic links, and even do some preset-specific magic like openblas does. Something that can be tested is thus to:
- empty the .javacpp cache dir
- run some similar app using the same javacpp dependencies but outside OSGi
- copy the content of the .javacpp directory, reference the native libs in Bundle-NativeCode and bundle them with the rest.
Maybe the javacpp.cachedir property should also be set to the configuration cache of the framework, but I'm not sure.
What would be missing is a JavaCPP tool to facilitate the process and avoid having to run some sample app first to populate the JavaCPP cache directory.
This approach would also be useful in another context: when building an app to be distributed as a jlink standalone image, in order to package a pre-extracted directory of native libs instead of polluting the user home directory with copies of the extracted libs.
@saudet, what do you think about it ?
@AlanBateman Thanks for pitching in! I'm sorry I didn't make the reason clear as to why I mentioned you here. I'm under the impression that FileSystems is the "new" preferred mechanism that should be used to access resources from wherever they come from, and it's already supported everywhere in the JDK, including for images generated by jlink, so specifications like OSGi should also provide support. Does that sound about right?
The "jrt" file system provider was created to avoid defining a new API for tools/IDEs that need to locate class files and other resources in a run-time image, e.g. an IDE running on JDK 8 may have opened a project that is configured to use JDK 11 or JDK 17, so code in the IDE needs to access these run-time images to discover the modules/classes. As to whether it's the "preferred mechanism" for accessing resources, probably not. Applications and libraries will continue to use Class::getResourceXXX to locate the resources in their own library/module and ClassLoader::getResourceXXX to scan the class path for resources.
I don't think I can help with the issue here. All I can say is that if there are bundleresource:/ URLs in the picture, then these URLs can only be opened if there is a corresponding URL stream handler. Some OSGi implementations are known to hack into the JDK to register their URL stream handlers, something that probably broke with Java 16, as URL internals are no longer accessible. This is why I mentioned the URLStreamHandlerProvider class: it is the supported way to deploy URL stream handler factories for other protocols, and it's much better than setting the system-wide factory.
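To make the URLStreamHandlerProvider remark concrete, here is a minimal sketch of such a provider. The "demo" scheme and the class name are made up for illustration; this is not how Equinox implements bundleresource, only the general shape of the JDK mechanism.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLStreamHandler;
import java.net.spi.URLStreamHandlerProvider;
import java.nio.charset.StandardCharsets;

// Hypothetical provider for a made-up "demo" scheme (an assumption of this sketch).
class DemoUrlHandlerProvider extends URLStreamHandlerProvider {
    @Override
    public URLStreamHandler createURLStreamHandler(String protocol) {
        if (!"demo".equals(protocol)) {
            return null; // let other providers or the JDK defaults handle it
        }
        return new URLStreamHandler() {
            @Override
            protected URLConnection openConnection(URL url) {
                return new URLConnection(url) {
                    @Override
                    public void connect() {
                        // nothing to connect to in this sketch
                    }

                    @Override
                    public InputStream getInputStream() {
                        // Serve the URL path back as the "resource" content.
                        byte[] data = getURL().getPath().getBytes(StandardCharsets.UTF_8);
                        return new ByteArrayInputStream(data);
                    }
                };
            }
        };
    }

    // Reads a "demo" URL by passing the handler explicitly, without any registration.
    static String read(String spec) {
        URLStreamHandler handler = new DemoUrlHandlerProvider().createURLStreamHandler("demo");
        try {
            URL url = new URL(null, spec, handler);
            try (InputStream in = url.openStream()) {
                return new String(in.readAllBytes(), StandardCharsets.UTF_8);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

For the JDK to pick up such a provider on its own, its class name would be listed in a META-INF/services/java.net.spi.URLStreamHandlerProvider file visible to the system class loader; the read helper above bypasses that step only to keep the sketch self-contained.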
> What would be missing is a JavaCPP tool to facilitate the process, so that one does not have to run a sample app first to populate the JavaCPP cache directory.
> This approach would also be useful in another context: when building an app to be distributed as a jlink standalone image, one could package a pre-extracted directory of native libs instead of polluting the user home directory with copies of the extracted libs.
> @saudet, what do you think about it?
@HGuillemet Sure, that's already possible. Try running mvn org.bytedeco:javacpp:cache, for example, inside https://github.com/bytedeco/javacpp-presets/tree/master/scipy/samples where you should get the following output:
[INFO] ------------------< org.bytedeco.scipy:sparselinalg >-------------------
[INFO] Building sparselinalg 1.5.6-SNAPSHOT
[INFO] --------------------------------[ jar ]---------------------------------
[INFO]
[INFO] --- javacpp:1.5.6-SNAPSHOT:cache (default-cli) @ sparselinalg ---
[INFO] Caching class org.bytedeco.cpython.presets.python
[INFO] Caching class org.bytedeco.numpy.presets.numpy
[INFO] Caching class org.bytedeco.scipy.presets.scipy
Info: Loading class org.bytedeco.javacpp.presets.javacpp
Info: Loading class org.bytedeco.openblas.presets.openblas_nolapack
Info: Loading class org.bytedeco.openblas.presets.openblas
Info: Loading class org.bytedeco.cpython.presets.python
Info: Loading class org.bytedeco.cpython.python
Info: Loading class org.bytedeco.numpy.presets.numpy
Info: Loading class org.bytedeco.scipy.presets.scipy
Info: Loading class org.bytedeco.mkl.presets.mkl_rt
PATH=/home/saudet/.javacpp/cache/javacpp-1.5.6-SNAPSHOT-linux-x86_64.jar/org/bytedeco/javacpp/linux-x86_64:/home/saudet/.javacpp/cache/openblas-0.3.15-1.5.6-SNAPSHOT-linux-x86_64.jar/org/bytedeco/openblas/linux-x86_64:/home/saudet/.javacpp/cache/cpython-3.9.6-1.5.6-SNAPSHOT-linux-x86_64.jar/org/bytedeco/cpython/linux-x86_64:/home/saudet/.javacpp/cache/numpy-1.21.0-1.5.6-SNAPSHOT-linux-x86_64.jar/org/bytedeco/numpy/linux-x86_64:/home/saudet/.javacpp/cache/scipy-1.7.0-1.5.6-SNAPSHOT-linux-x86_64.jar/org/bytedeco/scipy/linux-x86_64:/home/saudet/.javacpp/cache/mkl-2021.3-1.5.6-SNAPSHOT-linux-x86_64.jar/org/bytedeco/mkl/linux-x86_64
PACKAGEPATH=/home/saudet/.javacpp/cache/cpython-3.9.6-1.5.6-SNAPSHOT-linux-x86_64.jar/org/bytedeco/cpython/linux-x86_64:/home/saudet/.javacpp/cache/numpy-1.21.0-1.5.6-SNAPSHOT-linux-x86_64.jar/org/bytedeco/numpy/linux-x86_64/python:/home/saudet/.javacpp/cache/scipy-1.7.0-1.5.6-SNAPSHOT-linux-x86_64.jar/org/bytedeco/scipy/linux-x86_64/python
(We can also then put those paths in, respectively, PATH and PYTHONPATH to use them from the command line.)
> The "jrt" file system provider was created to avoid defining a new API for tools/IDEs that need to locate class files and other resources in a run-time image, e.g. an IDE running on JDK 8 may have opened a project that is configured to use JDK 11 or JDK 17, so code in the IDE needs to access these run-time images to discover the modules/classes. As to whether it's the "preferred mechanism" for accessing resources, probably not. Applications and libraries will continue to use Class::getResourceXXX to locate the resources in their own library/module and ClassLoader::getResourceXXX to scan the class path for resources.
@AlanBateman Well, JavaCPP is a "tool" that needs to locate files, but those files are not "class files". They are other kinds of files, such as native libraries, but also anything else required by them, such as source code written in Python for CPython, NumPy, SciPy, etc., or even in JavaScript, for that matter, in the case of frameworks like Node.js. Could you please explain how to accomplish that with Class::getResourceXXX and ClassLoader::getResourceXXX? And if it's possible to "scan" everything with these 2 methods, why can't IDEs use them? Why come up with a whole new "jrt" thing when it's not needed? I haven't found a way to use those methods to, for example, scan a subdirectory. How are we meant to accomplish what mvn org.bytedeco:javacpp:cache does if not with "jrt"?
> @AlanBateman Well, JavaCPP is a "tool" that needs to locate files, but those files are not "class files". They are other kinds of files, such as native libraries, but also anything else required by them, such as source code written in Python for CPython, NumPy, SciPy, etc., or even in JavaScript, for that matter, in the case of frameworks like Node.js. Could you please explain how to accomplish that with Class::getResourceXXX and ClassLoader::getResourceXXX? And if it's possible to "scan" everything with these 2 methods, why can't IDEs use them? Why come up with a whole new "jrt" thing when it's not needed? I haven't found a way to use those methods to, for example, scan a subdirectory. How are we meant to accomplish what mvn org.bytedeco:javacpp:cache does if not with "jrt"?
If the "jrt" file system works for you then great: it gives you a file system view of a run-time image, so you can walk directory trees, list files in directories, and so on. Note that this is not limited to the current run-time. As I said, a tool or IDE running on JDK X may need to introspect the classes/resources in another JDK Y on the file system. This is very different from the Class.getXXX methods, which are a standard API for an application or library to locate its own resources. Resources just have a name; there is no notion of directories in the resource API.
Shared libraries need to be extracted to the local file system to be useful, but I assume you know that already.
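As an illustration of the "file system view" described above, the following minimal sketch lists one directory of the current run-time image through the "jrt" file system. It assumes JDK 9 or later; the class name and the chosen directory are only examples.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class JrtListing {
    // Lists the entries directly under a directory of the current run-time image.
    static List<String> list(String dir) {
        FileSystem jrt = FileSystems.getFileSystem(URI.create("jrt:/"));
        try (Stream<Path> entries = Files.list(jrt.getPath(dir))) {
            return entries.map(Path::toString).sorted().collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Print a few entries of the java.lang package in the java.base module.
        list("/modules/java.base/java/lang").stream().limit(5).forEach(System.out::println);
    }
}
```

Directory listing works here because a file system is involved; the resource API only resolves names it is given, which is the distinction drawn above.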
As a workaround for classpath walking in cases where the typical classpath scanning libraries do not work, I do this:
- create a "catalog" file at a well known location, e.g. at /META-INF/catalog and write into that file the paths of all the resource files that I would like to discover
- use ClassLoader.getResources() to locate all the /META-INF/catalog files
- optionally filter the content of the catalog files according to some search/path pattern
- at this point I then know the paths of all the files I am interested in and then I can use ClassLoader.getResource() to load them one by one
So if I understand the discussion correctly, then it should be possible to generate such a file at build time (e.g. using a custom Maven plugin), place it into the Maven artifacts which contain the native libs/resources and make the loader consult this file in order to determine what to extract into the cache instead of having the loader rely on classpath scanning.
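The catalog workaround can be sketched roughly as follows. The catalog location, the one-path-per-line format, and the class name are assumptions made for this example; they are not an existing JavaCPP feature.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class CatalogScanner {
    // Hypothetical well-known location; one resource path per line (an assumption).
    static final String CATALOG = "META-INF/catalog";

    // Collects the URLs of all catalogued resources whose path starts with the prefix.
    static List<URL> find(ClassLoader loader, String prefix) {
        List<URL> found = new ArrayList<>();
        try {
            // Every jar or bundle on the class path may contribute its own catalog.
            for (URL catalog : Collections.list(loader.getResources(CATALOG))) {
                try (BufferedReader reader = new BufferedReader(
                        new InputStreamReader(catalog.openStream(), StandardCharsets.UTF_8))) {
                    String line;
                    while ((line = reader.readLine()) != null) {
                        String path = line.trim();
                        if (path.isEmpty() || !path.startsWith(prefix)) {
                            continue; // filter by a simple path prefix
                        }
                        // Resolve each listed path by name; no directory scanning needed.
                        URL resource = loader.getResource(path);
                        if (resource != null) {
                            found.add(resource);
                        }
                    }
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return found;
    }
}
```

Because only getResources() and getResource() are used, this should work with any class loader that can open its own URLs, whatever their scheme. The catalog itself would have to be generated at build time.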
> If the "jrt" file system works for you then great: it gives you a file system view of a run-time image, so you can walk directory trees, list files in directories, and so on. Note that this is not limited to the current run-time. As I said, a tool or IDE running on JDK X may need to introspect the classes/resources in another JDK Y on the file system. This is very different from the Class.getXXX methods, which are a standard API for an application or library to locate its own resources. Resources just have a name; there is no notion of directories in the resource API.
Ok, good, thanks! I just wanted to make sure there was no other way of doing what we need to do here in the case of jlink images.
> Shared libraries need to be extracted to the local file system to be useful, but I assume you know that already.
Yes, JavaCPP does that to make resources accessible to native libraries. That is something that should be part of the JDK, because then everyone could reuse the same code instead of coming up with their own hacks, but I've gone over these things again and again with Panama, and it never leads anywhere, so let's not have a discussion about this. :)
> So if I understand the discussion correctly, then it should be possible to generate such a file at build time (e.g. using a custom Maven plugin), place it into the Maven artifacts which contain the native libs/resources and make the loader consult this file in order to determine what to extract into the cache instead of having the loader rely on classpath scanning.
Sure, that's a valid way to work around these usability issues of the JDK, but that information is already in JAR files as well as in the images generated by jlink and GraalVM. Why not use that information? From what I understand of what @AlanBateman says, FileSystems is meant to be the way to access that information.
So we've got 3 options here:
1. Make OSGi "jrt compliant" by implementing a backend for FileSystems or whatever it is that we actually need to do, and use that in JavaCPP, which will also make it work with images from jlink and maybe even GraalVM in the future
2. Implement a hack like putting lists of resources in a "catalog" file or something
3. Implement many many container-specific hacks, such as with the resolver of OSGi
Option 3, implemented with reflection, is probably the easiest thing to do to get something working with OSGi for now, so if that sounds OK to you @HannesWell, please send a pull request with the updated code and I'll be happy to merge that!
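A container-specific hack done with reflection could look roughly like the sketch below. It tries Eclipse's FileLocator.resolve(URL), mentioned earlier in this thread, without a compile-time dependency, and falls back to the original URL otherwise. The class name OptionalUrlResolver is hypothetical, and whether the Eclipse class is actually visible from JavaCPP's class loader inside an OSGi framework is an open assumption.

```java
import java.lang.reflect.Method;
import java.net.URL;

class OptionalUrlResolver {
    // Tries to convert a container-specific URL into a standard one.
    // Returns the URL unchanged if no resolver is available or the call fails.
    static URL resolve(URL url) {
        String scheme = url.getProtocol();
        if ("file".equals(scheme) || "jar".equals(scheme)) {
            return url; // already a standard scheme, nothing to do
        }
        try {
            // Looked up by name so that there is no compile-time dependency on Eclipse.
            Class<?> locator = Class.forName("org.eclipse.core.runtime.FileLocator");
            Method resolve = locator.getMethod("resolve", URL.class);
            return (URL) resolve.invoke(null, url);
        } catch (ReflectiveOperationException | LinkageError e) {
            // Not running in Eclipse, the class is not visible here, or the call failed.
            return url;
        }
    }
}
```

Outside Eclipse the lookup simply fails and the URL passes through untouched, so plain Java applications would be unaffected by such a check.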
I think that's the key point of the problem. Extraction of directories does not work with Equinox-OSGi.
> If this is the goal then realistically JavaCPP will need some OSGi aware code which gets hold of the bundle containing the native code (based on the classloader being a BundleReference) and then uses getEntryPaths to find the files.
>
> Before going down this route I would like to be sure that setting the Bundle-NativeCode header isn't enough to get it working. The reason for this is that JavaCPP would have to start tracking bundles to invalidate their cached libraries if they are uninstalled/updated, whereas right now I'm pretty sure OSGi is managing the library loading and making that a non-issue.
Setting the Bundle-NativeCode header is very likely not sufficient for the python presets, because many other files that are not native library files (like some .py files) have to be cached as well.
> > We can either use a custom service, like @HannesWell suggests, but would something simple that transforms a URL with the unknown scheme to a standard one (file: or jar:) not be enough? For Eclipse, the implementation would just call FileLocator.resolve.
>
> This is approximately what I'm suggesting as "OSGi aware code", but I would recommend using the OSGi API over the Equinox API so that it works in Felix too.
Agree on that.
> Another option is to not use JavaCPP at all for extracting native libs but use the OSGi feature based on Bundle-NativeCode that you mention. So basically you would extract JavaCPP native jars in directories of your choice, reference the directories in the Bundle-NativeCode header and bundle them. However JavaCPP does not only unzip the jar when extracting native libs: it can reorganize it (flatten or not), create symbolic links, and even do some preset-specific magic like openblas does. Something that can be tested is thus to:
> - empty the .javacpp cache dir
> - run some similar app using the same javacpp dependencies but outside OSGi
> - copy the content of .javacpp directory, reference the native libs in Bundle-NativeCode and bundle them with the rest.
>
> Maybe the javacpp.cachedir property should also be set to the configuration cache of the framework, but I'm not sure.
If javacpp only unzipped the jars, we could also install the bundles in exploded format by adding Eclipse-BundleShape: dir to the Manifest for the Equinox framework, and corresponding headers for other OSGi implementations like Felix.
However, for any approach it has to be considered that the path of a jar's cached content within the cache is different when the URL scheme is unknown to javacpp, at least at the moment.
When a known scheme is used (e.g. in a plain Java application built by Maven), the path for cpython within the cache is for example:
cpython-3.9.2-1.5.5-windows-x86_64.jar\org\bytedeco\cpython\windows-x86_64
For an unknown scheme this is different (I can reproduce it if necessary).
@reckart your suggestion would be a possible approach to solve this problem and make javacpp work out of the box for users. But of course it would require more work on the side of javacpp.
> So we've got 3 options here:
> 1. Make OSGi "jrt compliant" by implementing a backend for FileSystems or whatever it is that we actually need to do, and use that in JavaCPP, which will also make it work with images from jlink and maybe even GraalVM in the future
> 2. Implement a hack like putting lists of resources in a "catalog" file or something
> 3. Implement many many container-specific hacks, such as with the resolver of OSGi
>
> Option 3 implemented with reflection is probably the easiest thing to do to get something working with OSGi for now, so if that sounds OK to you @HannesWell please send a pull request with the updated code and I'll be happy to merge that!
Maybe I'm wrong, but shouldn't a FileSystem for a container-specific scheme be provided by that container/framework? Otherwise you would again have to implement many different FileSystems for different containers.
I would be very happy to send a PR to solve this issue for now. But I'm not exactly sure how an (optional) dependency on an OSGi implementation should be avoided by using reflection. Do you mean the ClassLoader should search for Eclipse's FileLocator class by reflection? I expect this not to work, because the javacpp bundle is not wired to the corresponding Eclipse package and therefore probably will not find that class. Furthermore, I would not like to introduce code that potentially does not work when strong encapsulation is enforced by the JDK, even though I'm not sure about that, because I don't know how strongly the Eclipse bundles are 'modularized'. In any case, I think reflection should be avoided in general if possible.
A change like the one I have prototyped here (https://github.com/HannesWell/javacpp/commit/634b24966e73395d4982cffac733230ec511c7c8) would not introduce any new API or any new runtime dependency and could be applied to other OSGi implementations as well, with a moderate amount of required changes. So if you agree with that, I can fine-tune it and submit a PR.