jni-rs icon indicating copy to clipboard operation
jni-rs copied to clipboard

Find and load the JVM library using java-locator and libloading

Open argv-minus-one opened this issue 3 years ago • 11 comments

Overview

This PR adds the feature invocation-dyn. When this feature is enabled, JavaVM::new will discover the location of the JVM shared library using the java-locator crate and load it using the libloading crate, instead of relying on the system dynamic linker.

That way, there's no need to add the location of the JVM shared library to $LD_LIBRARY_PATH or %Path%, making it much easier to run a program that uses the Invocation API. With the invocation-dyn feature enabled, this crate's tests all run successfully (on my Linux and Windows machines, at least) without having to run the test_profile script. (I don't have a Mac to test on at the moment.)

This PR also adds JavaVM::with_libjvm, which takes the location of the JVM shared library from the caller instead of using java-locator.

Breaking Changes

Because the java-locator and libloading calls can both fail, JavaVM::new now returns Result<Self, StartJvmError>. StartJvmError wraps java_locator::errors::Error and libloading::Error in addition to jni::errors::Error. Code that expects it to return Result<Self, jni::errors::Error> will have to be updated.

Question

Do you want to keep the original linking scheme? I'm not sure if there are some situations where it's better to link with the JVM shared library at build time (the original invocation feature) instead of discovering and loading it at run time (the invocation-dyn feature), so I kept both. On the other hand, removing the original linking scheme would simplify both the code (build.rs, test_profile) and the documentation of this crate.

Because I don't know if we're keeping both, I haven't yet fully documented the new feature. Please let me know which way you want to proceed, and I'll finish the documentation.

Definition of Done

  • [x] There are no TODOs left in the code
  • [x] Change is covered by automated tests
  • [x] The coding guidelines are followed
  • [ ] Public API has documentation
  • [ ] User-visible changes are mentioned in the Changelog
  • [ ] The continuous integration build passes

argv-minus-one avatar Jan 05 '21 12:01 argv-minus-one

Hi @argv-minus-one ,

I might be missing something, could you clarify please in which cases you need to dynamically link jvm instead of linking it at build time and loading at runtime? The latter still allows you to load any compatible jvm version (e.g., you can build with 8, but load 15 at runtime).

dmitry-timofeev avatar Jan 24 '21 10:01 dmitry-timofeev

My project (a JNI library meant to be loaded from Java code) contains tests that need to start up a JVM using the Invocation API. I want to be able to run cargo test without any special setup (like setting LD_LIBRARY_PATH or running a script like test_profile), just like any other Rust project.

argv-minus-one avatar Jan 25 '21 03:01 argv-minus-one

I see. I think we have to figure out the following things:

  1. The level of support — whether we support:
  • Only dynamic loading/static linking
  • Dynamic linking
  • Dynamic linking + discovery. I think dynamic linking is good to support in a general purpose library. @stanislav-tkach , Do you think it is reasonable to also add JVM installation discovery (here, through "java-locator" crate)? Any concerns on such a dependency?

There is also a question of consistency: there is already an implementation discovering jvm lib in build.rs — is it reasonable to include and document two different, with possibly different behaviour?

  1. API. Is it OK to have a method that changes its user-visible behaviour based on a feature (@stanislav-tkach)? I see it can be useful in the tests, where you can reconfigure the type of linking, but is it good enough to have only that? One concern is that the documentation of the method will have to include both things; and there is no option for a more apparent API, where you can see in the client code what it does without looking into Cargo.toml.

dmitry-timofeev avatar Feb 06 '21 16:02 dmitry-timofeev

I agree that it can be convenient to locate the JVM library automatically and I see no issues with the java-locator dependency.

Is it OK to have a method that changes its user-visible behaviour based on a feature (@stanislav-tkach)?

I'm not sure if there are any alternatives here.

stanislav-tkach avatar Feb 12 '21 19:02 stanislav-tkach

Is it OK to have a method that changes its user-visible behaviour based on a feature (@stanislav-tkach)?

Upon further reflection I think it would be better to have separate methods. Not only does it make it easier to document and understand API, but it is also rather common for Rust API to have a group of functions like new/new_with..., because there is no function overloading.

stanislav-tkach avatar Feb 21 '21 17:02 stanislav-tkach

@stanislav-tkach If we split it into two methods, then they cannot exist in the same build, since the basic dynamic linking (invocation) and discovery (invocation-dyn) mechanisms are mutually exclusive. I can think of two ways to do this:

  1. Mutually exclusive features.

    I would make the invocation and invocation-dyn features mutually exclusive. Attempting to activate both would result in a compiler error.

    This could be a problem if an application depends on the jni crate indirectly through two or more library crates. If crate A enables the invocation feature and crate B enables the invocation-dyn feature, then there's no way for crate C to depend on both A and B. They cannot be used together.

  2. invocation-dyn removes JavaVM::new.

    If the features are not mutually exclusive, then enabling the invocation-dyn feature would need to remove the JavaVM::new method.

    This would be very odd. One generally expects enabling a feature to only add APIs, not remove them. But this would avoid the problems with the other approach.

Both approaches have the additional problem that there would be no way to say “Just start a JVM. I don't care how it's loaded.” Code that starts a JVM would have to take a different code path if the invocation-dyn feature is enabled. That would make it arguably easier to understand what the code is doing in the simple case, but at the cost of making code more complicated if it needs to support both invocation features:

let vm = cfg_if! {
    if #[cfg(feature = "invocation-dyn")] {
        JavaVM::with_discovery(args)
    } else {
        JavaVM::new(args)
    }
}?;

Do you want me to split JavaVM::new anyway? If so, which approach should I use?

argv-minus-one avatar Feb 21 '21 23:02 argv-minus-one

This would be very odd. One generally expects enabling a feature to only add APIs, not remove them. But this would avoid the problems with the other approach.

Good point, though mutually exclusive features are also not that common in Rust and aren't straightforwardly supported.

Honestly, I don't have a strong opinion here. At first @dmitry-timofeev's questions convinced me that separate methods would be better, but now I'm not sure anymore. Still I'm not sure that "Just start a JVM somehow" approach is a good idea anywhere except tests.

stanislav-tkach avatar Mar 03 '21 21:03 stanislav-tkach

Apologies for bumping an old PR.

We've been using a fork of this with the PR merged to serve as a launcher for Java applications (ala launch4j or winrun4j), and thought I'd share an issue for others using this:

At least on windows, to correctly load jvm.dll the Java runtime's bin folder must be added to the DLL search path, otherwise it will fail to load. This is not evident on systems that have a Java installation on the path already. This requires a window's api call to SetDllDirectoryW/A before attempting to load the JVM.

Not sure if other OSes face this issue as well.

keastrid avatar Jun 06 '22 23:06 keastrid

@keastrid: It will also look at the JAVA_HOME environment variable (if it is non-empty). Alternatively, you can call JavaVM::with_libjvm to load jvm.dll from an arbitrary path.

If what you have is an arbitrary path to the root of a Java installation (not specifically to jvm.dll), then you could do what java-locator does and simply search that folder's subtree for a file named jvm.dll. If you're cross-platform, use java_locator::get_jvm_dyn_lib_file_name to determine the name of the file you're looking for (jvm.dll on Windows, libjvm.dylib on macOS, etc). Then, once you've found the correct path, pass it to JavaVM::with_libjvm.

argv-minus-one avatar Jun 09 '22 21:06 argv-minus-one

@argv-minus-one Yes, finding the jvm.dll is not an issue. Just passing the path to it to JavaVM::with_libjvm however is not sufficient to load the JVM as jvm.dll requires other dynamic libraries, in the bin folder, in order to load. If the system can locate those libraries from another location it is not an issue.

keastrid avatar Jun 09 '22 21:06 keastrid

@keastrid On Linux, finding libjvm.so seems to be enough. It somehow figures out where to find all the other shared libraries (libjava.so, libprefs.so, and so on) without needing them to be on the search path. Sounds like the Windows JVM can't do that, which is an unpleasant surprise.

This Stack Overflow answer implies that SetDllDirectory sets a per-process variable, rather than a per-thread, so it's not safe to call that unless you're certain that the process has only one thread. Your launcher can be certain of that, but this library can't.

What if, when you start the JVM, you pass an option -Djava.library.path=PATH\TO\JAVA_HOME\bin to the JVM using InitArgsBuilder::option? Does it work without SetDllDirectory then?

argv-minus-one avatar Jun 30 '22 05:06 argv-minus-one

I'm needing to dynamically choose a different JVM based on various conditionals in the program and it looks like this PR is precisely what I need. What's holding it back from inclusion so far so it can be resolved?

OvermindDL1 avatar Nov 16 '22 22:11 OvermindDL1

Hi @OvermindDL1; At least from my pov this was a change I wanted to look at more closely to understand how it affects (or doesn't affect) running on Android.

Unfortunately I still haven't had a chance to look at it more closely but that would have been why I didn't end up including it in the last 0.20 milestone / release.

If someone get's a chance to test/review this in the context of Android before I manage to look at this that might help expedite landing this, or a similar change.

Hope that helps explain why it's just sitting here for now.

There was a pretty huge backlog of issues and PRs before the last 0.20 release (since the project was dormant for a while) and so I was pretty picky about skipping over PRs like this where I figured it was going to take a bit more time / attention to review.

rib avatar Nov 18 '22 04:11 rib

Okey, trying to look in to this to see if we can get this landed...

I've added it to the 0.21 milestone too to hopefully not lose track of resolving this for the next release (it includes an API break so at least has to wait for a semver bump)

My initial instinct is:

  • Don't have separate entry points based on mutually exclusive features - just have JavaVM::new() that will conceptually always aim to "Just Work™"
  • I guess at the moment that it would be reasonable to just keep the dynamic implementation so we can keep things simpler - if something needs to force the use of a specific implementation they still have the option of using JavaVM::with_libjvm()
  • Although my initial concern was "How will this work on Android?" I think it's probably orthogonal since on Android the jni library isn't used to create the JavaVM anyway.

It could be good to hear more thoughts regarding the discussion started here but at least my initial impression is that the invocation-dyn feature doesn't in itself change the need to add the jvm bin/ path to the DLL search path on Windows (that's also the case with the existing invocation feature) so I suppose we don't need to treat this as a regression - it's maybe just unfortunate that things aren't quite as convenient on Windows. This issue can also potentially be addressed with further work separately.

rib avatar Nov 18 '22 15:11 rib

Just for reference; smoke testing this on Windows then cargo test did actually Just work for me even without adding my jdk's bin/ directory to the PATH explicitly (which I normally have to do to use the current invocation feature for testing)

If I list the full dependency chain for jvm.dll via https://github.com/lucasg/Dependencies then I see that all the various api-ms-win-crt-* dependencies that are available under the jvm/bin directory actually just get resolved via C:\Windows\system32\ucrtbase.dll instead

So in effect all the libraries under the jvm/bin directory are apparently redundant in my case (On Windows 10 with openjdk-19_windows-x64_bin)

Maybe it's not quite so convenient in other cases on Windows though.

rib avatar Nov 18 '22 16:11 rib

Okey, I've rebased this work and then made two follow up changes:

  1. I just kept the new dynamic loading in favor of the previous build-time linking. (including some corresponding doc updates, and this just keeps a single "invocation" feature)
  2. I remove the static global cache for the loaded library - which looked like it could potentially interfere with calling with_libjvm() more than once with different paths, and since it wasn't clear that the short cut / optimization was really necessary.

@argv-minus-one - sorry that it's taken a long time for there to be any movement with your PR but if you're still in a position to look a this it'd be great if you're able to take a look at the follow up comments / changes here and see if they look good to you?

rib avatar Nov 18 '22 17:11 rib

Just for reference; smoke testing this on Windows then cargo test did actually Just work for me even without adding my jdk's bin/ directory to the PATH explicitly (which I normally have to do to use the current invocation feature for testing)

If I list the full dependency chain for jvm.dll via https://github.com/lucasg/Dependencies then I see that all the various api-ms-win-crt-* dependencies that are available under the jvm/bin directory actually just get resolved via C:\Windows\system32\ucrtbase.dll instead

So in effect all the libraries under the jvm/bin directory are apparently redundant in my case (On Windows 10 with openjdk-19_windows-x64_bin)

Maybe it's not quite so convenient in other cases on Windows though.

I have not tried the other proposed solution to this, and currently don't have plans to do so.

It has been some time, so I don't remember the details of this very well. I could have misidentified the problem based on the solution.

This issue was tricky to test, and often showed for users as a JVM crash that pointed to failing to load the garbage collector, as on their path/java_home they had an older version of java installed that was missing the newer GC that was trying to be used. I think it was recreated once on a system without any java installed aside from the one being pointed to, but I cannot recall.

The folder structure when testing:

app folder
  /jre folder (contains jre to be run be the app)
    /bin
    /conf
    /lib
    release
  app.exe

These are the rules for standard DLL lookup. The directory from which the application loaded is probably the directory of the app.exe, which if differs from the folder of jvm.dll would cause the issue.

Looking at libloading's documentaion, assuming I'm not misunderstanding it, it might change the search order based on if you're passing a relative or absolute path. So I might have misunderstood libloading entirely and calling SetDLLDirectory might be unneeded, and just changing the path to be absolute/relative would work... If that is the case, that's some "gotcha."

keastrid avatar Nov 18 '22 20:11 keastrid

This issue was tricky to test, and often showed for users as a JVM crash that pointed to failing to load the garbage collector, as on their path/java_home they had an older version of java installed that was missing the newer GC that was trying to be used. I think it was recreated once on a system without any java installed aside from the one being pointed to, but I cannot recall.

Thanks for trying to recall the issue you saw; though I suppose it's tricky to infer quite what might have been happening at the time.

It's certainly possible to imagine situations where an application might inadvertently try and mix and match between a libjvm library that could be loaded by path that then ends up trying to load dependencies based on how the dynamic linker searches which might not necessarily find the dependencies that were properly validated with the original libjvm.

I would think that corner case situations along those lines would also be theoretically possible when the jvm library has been linked at build considering that the loading of dependencies should happen in much the same way (by the platform-specific runtime linker).

So in terms of evaluating whether to merge this change to make jni-rs load the jvm library dynamically at runtime vs linking at build time I don't expect that dynamic linking would make that kind of issue more likely. The risk of a discrepancy between the JAVA_HOME state and PATH or LD_LIBRARY_PATH state seems to be possible in both cases.

Looking at libloading's documentaion, assuming I'm not misunderstanding it, it might change the search order based on if you're passing a relative or absolute path. So I might have misunderstood libloading entirely and calling SetDLLDirectory might be unneeded, and just changing the path to be absolute/relative would work... If that is the case, that's some "gotcha."

I don't guess that libloading itself really does anything special for an absolute path. I think it's just worded that way because an absolute path should have consistent/predictable behavior across platforms (if the file exists it should be loaded otherwise it should fail and won't then follow a platform specific search). I think this documentation is more just highlighting that for all other cases (relative paths, or just the module name) then it will be very platform specific.

E.g. if you see the details about dll search order on Windows here they effectively note at the beginning that specifying a full path is a special case that will bypass any other searching:

"Applications can control the location from which a DLL is loaded by specifying a full path or using another mechanism such as a manifest. If these methods are not used, the system searches for the DLL at load time as described in this topic."

rib avatar Nov 18 '22 23:11 rib

I started merging PRs with API changes for 0.21 today and am hoping to land this PR too but I'll let it sit for a few days in case @argv-minus-one may get a chance to skim over the two follow up patches and ACK whether they seem reasonable - or possibly have comments to address.

rib avatar Nov 19 '22 00:11 rib

You might be right with documentation of libloading.

I do think the issue is something for jni-rs to handle, rather than leaving it to the user, as the errors you get are strange (seemingly random app users getting JVM crashes on startup), or at least documented in jni-rs.

In regards to the load issue already being present, if it is, it probably went unnoticed as jni-rs seems to be targeting a different usecase (call Rust), while I was using it for an app launcher (would also add that for GUI applications/launchers, the call to DetroyJavaVm is very important).

keastrid avatar Nov 19 '22 00:11 keastrid

@rib Looks good to me.

argv-minus-one avatar Nov 19 '22 01:11 argv-minus-one

You might be right with documentation of libloading.

I do think the issue is something for jni-rs to handle, rather than leaving it to the user, as the errors you get are strange (seemingly random app users getting JVM crashes on startup), or at least documented in jni-rs.

Yeah, if there's something concrete that jni-rs can do that helps ensure the right dependency libraries get loaded in more situations we can hopefully look at that.

To do something here though will probably also require a more specific issue that's reproducible so we can be sure we fully understand what problem needs to be solved.

e.g. I'd probably be cautious of having jni-rs making automatic changes to library search paths in case that actually conflicts with the environment that the user has intentionally configured but there could be room for having better default behaviour in some specific case.

My gut feeling at the moment is that the issue you were seeing with crashes wasn't caused by the use of dynamic/runtime linking vs build time linking so hopefully merging this PR isn't going to suddenly trigger lots of crashes for people.

It's hard to prove that beyond a doubt (without a detailed way of reproducing the original issue) but the dependency loading should be almost identical in both scenarios I think (in both cases the platform-specific linker is going to load dependencies automatically and will follow platform-specific searches for the right libraries)

In terms of documentation; the the JavaVM docs have been updated a little with this PR and they at least recommend that JAVA_HOME should be set and they note that it may be necessary to configure the platform specific search path for finding dependencies (e.g. setting PATH on Windows).

There's probably room to improve the docs further at some point though. If we had a more detailed understanding of the crash cases you hit that might also help determine how the docs could be improved it it turns out that the environment configuration is a key contributing factor.

In case you do happen to reproduce the kind of crash you got before, it'd be great if you'd be able to create an issue for that specifically with more details that can be investigated.

In regards to the load issue already being present, if it is, it probably went unnoticed as jni-rs seems to be targeting a different usecase (call Rust), while I was using it for an app launcher (would also add that for GUI applications/launchers, the call to DetroyJavaVm is very important).

It's possible you were hitting a less trodden path, yeah, though it's quite tricky to guess too.

jni-rs should hopefully cover a reasonable variety of cases though. E.g. I'm primarily using it on Android which is another broadly different case and that typically entails a mix of JNI from Rust -> Java from different threads as well as native Java -> Rust calls (but you also don't manually load/destroy the JavaVM on Android)

rib avatar Nov 19 '22 04:11 rib

Oh I didn't expect all this work rib, you are epic! Thanks much!

OvermindDL1 avatar Nov 29 '22 21:11 OvermindDL1