
Remove some limitations of GraalVM Native Image

Open christianwimmer opened this issue 3 years ago • 10 comments

GraalVM Native Image has a few limitations that prevent users from building native images from their code bases. This issue describes the areas where improvements are planned over the next releases. Actual work is done under separate issues, linked from this issue so that everyone who wants to work on it gets an overview of who is working on what.

Support for the module system

The native image generator does not support a module path argument yet on JDK 11. The plan is to

  • support all module path arguments in the same way as the java launcher, and
  • allow introspection of module information at run time using the respective JDK reflection APIs.

Only information about modules where at least one class is reachable will be available at run time.
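
As an illustration of the kind of run-time introspection meant here, the following is a minimal sketch that uses only the standard JDK module reflection APIs (`Class.getModule`, `ModuleLayer.boot`). The class name `ModuleIntrospection` and its helper methods are made up for this example. It shows the behavior on the Java HotSpot VM and makes no claim about what a particular native image returns.

```java
import java.util.Optional;

// Hypothetical example class: queries module information via JDK reflection APIs.
class ModuleIntrospection {

    // Name of the module that declares the given class (null for the unnamed module).
    static String moduleNameOf(Class<?> type) {
        return type.getModule().getName();
    }

    // Looks up a module by name in the boot layer.
    static Optional<Module> findInBootLayer(String name) {
        return ModuleLayer.boot().findModule(name);
    }

    public static void main(String[] args) {
        System.out.println("String is declared in: " + moduleNameOf(String.class));
        System.out.println("java.base in boot layer: " + findInBootLayer("java.base").isPresent());
        // Whether this class is in a named module depends on how it was launched
        // (class path vs. module path).
        System.out.println("Own module is named: " + ModuleIntrospection.class.getModule().isNamed());
    }
}
```

Under the plan described above, a lookup like `findInBootLayer` would only be expected to succeed at image run time for modules that have at least one reachable class.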

Work to support the module path arguments is currently in progress: https://github.com/oracle/graal/issues/1962

Support for method handles and the invokedynamic bytecode

Currently only method handles that are a compile time constant are supported. This is an important subset because it allows, for example, all Java lambdas and the invokedynamic-based string concatenation. Still, we want to support method handles also when they are not a compile time constant, i.e., when method handle chains are constructed and changed at run time.

The approach will be similar to reflection: methods that are available via a method handle must be registered in a configuration file at image build time.

Note that the performance of such method handles will be slower compared to the Java HotSpot VM because no dynamic compilation of method handle chains is possible.
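
To make the distinction concrete, here is a small sketch that contrasts the two shapes of code. The class `MethodHandleKinds` and its methods are invented for illustration. Whether a given handle is actually treated as a compile time constant is decided by the image builder's analysis; the `static final` field is only the typical pattern, not a guarantee.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

// Hypothetical example class contrasting constant and run-time-built method handles.
class MethodHandleKinds {

    // Typical "constant" shape: resolved once and stored in a static final field.
    static final MethodHandle LENGTH;

    static {
        try {
            LENGTH = MethodHandles.lookup()
                    .findVirtual(String.class, "length", MethodType.methodType(int.class));
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    static int constantCase(String s) throws Throwable {
        return (int) LENGTH.invokeExact(s);
    }

    // Non-constant shape: the target methods are chosen by name at run time
    // and combined into a chain (first, then second).
    static String dynamicChain(String first, String second, String input) throws Throwable {
        MethodHandles.Lookup lookup = MethodHandles.lookup();
        MethodType stringToString = MethodType.methodType(String.class);
        MethodHandle a = lookup.findVirtual(String.class, first, stringToString);
        MethodHandle b = lookup.findVirtual(String.class, second, stringToString);
        MethodHandle chain = MethodHandles.filterReturnValue(a, b);
        return (String) chain.invoke(input);
    }

    public static void main(String[] args) throws Throwable {
        System.out.println(constantCase("graal"));
        System.out.println(dynamicChain("trim", "toUpperCase", "  hi "));
    }
}
```

In `dynamicChain`, the methods reached through the handles (`trim` and `toUpperCase` in the usage above) are the kind of targets that, under the proposed approach, would have to be registered at image build time.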

Work is currently in progress: https://github.com/oracle/graal/issues/2761

(Done) Support for Java serialization

Java serialization is a special form of reflection: during deserialization, classes are instantiated and fields are set in a reflective way without explicit bytecode. So the implementation approach will be similar to reflection: classes that are available for serialization must be registered in a configuration file at image build time.
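
The reflective nature of deserialization can be seen in a small round trip. This is a sketch with invented names (`SerializationRoundTrip`, `Point`, `constructorCalls`); it runs on a regular JVM and does not show the native image configuration itself. The counter shows that deserialization creates the object and sets its `final` fields without running the class's own constructor.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical example class: a serialization round trip through a byte array.
class SerializationRoundTrip {

    // Counts how often the Point constructor actually runs.
    static int constructorCalls = 0;

    static final class Point implements Serializable {
        private static final long serialVersionUID = 1L;
        final int x;
        final int y;

        Point(int x, int y) {
            constructorCalls++;
            this.x = x;
            this.y = y;
        }
    }

    static byte[] serialize(Object value) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        }
        return bytes.toByteArray();
    }

    static Object deserialize(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Point copy = (Point) deserialize(serialize(new Point(3, 4)));
        // The fields are restored although the constructor ran only for the original object.
        System.out.println(copy.x + "," + copy.y + " constructor calls: " + constructorCalls);
    }
}
```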

(Update: support for serialization is finished and available starting with GraalVM 21.0)

Improve support for resource bundles and locales

Currently, the locale is fixed at image build time, i.e., registered resource bundles are only included for a single locale. This single locale is of course also the default locale at runtime. This will be made more flexible, so that arbitrary locales can be included and the default locale is set correctly at run time.

Work is currently in progress: https://github.com/oracle/graal/issues/2982

Improve support for resource registration

All resources that should be available at run time need to be listed at image build time. Currently, all listed resources are included in a non-structured form, i.e., it is not possible to navigate resources like a file system. Also resources are not separated by class loader, i.e., every class loader sees all resources.

The image heap must include a virtual file system for the included resources that allows navigation.
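
For reference, this is roughly what "navigating resources like a file system" looks like on the Java HotSpot VM. The sketch uses invented names (`ResourceNavigation`, `listResourceDirectory`, `demo`) and a temporary directory as the resource root. It assumes the resource URL uses the `file:` scheme, which holds for a directory-backed `URLClassLoader` but not for resources inside a JAR.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical example class: lists the entries below a resource directory.
class ResourceNavigation {

    // Resolves a directory resource and lists its entries (assumes a file: URL).
    static List<String> listResourceDirectory(ClassLoader loader, String dir) throws Exception {
        URL url = loader.getResource(dir);
        if (url == null) {
            return List.of();
        }
        try (Stream<Path> entries = Files.list(Path.of(url.toURI()))) {
            return entries.map(p -> p.getFileName().toString())
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    // Builds a small resource tree in a temporary directory and navigates it.
    static List<String> demo() throws Exception {
        Path root = Files.createTempDirectory("resources-demo");
        Path conf = Files.createDirectory(root.resolve("conf"));
        Files.writeString(conf.resolve("a.txt"), "a");
        Files.writeString(conf.resolve("b.txt"), "b");
        try (URLClassLoader loader = new URLClassLoader(new URL[] { root.toUri().toURL() }, null)) {
            return listResourceDirectory(loader, "conf");
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo());
    }
}
```

With the non-structured form described above, a lookup by exact name such as `conf/a.txt` can work for a listed resource, while a directory listing like the one in `listResourceDirectory` has nothing to navigate.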

Allow incomplete class path and all security services by default

Currently, support for an incomplete class path must be enabled manually via --allow-incomplete-classpath. This ensures that no linking errors are thrown at run time. But since in practice every large project needs to specify this option, we will flip the default and enable the support by default.
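
A common source of an incomplete class path is a library with an optional dependency. The sketch below uses invented names (`OptionalDependency`, `isPresent`, and the placeholder class name `com.example.missing.Codec`). It uses a reflective lookup as a stand-in, because a direct bytecode reference to a missing class cannot be shown in a single self-contained file.

```java
// Hypothetical example class: probes for an optional dependency at run time.
class OptionalDependency {

    // Returns true if the named class can be loaded, without initializing it.
    static boolean isPresent(String className) {
        try {
            Class.forName(className, false, OptionalDependency.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // "com.example.missing.Codec" is a placeholder for a class that is not on the class path.
        String codec = "com.example.missing.Codec";
        if (isPresent(codec)) {
            System.out.println("Using optional codec");
        } else {
            System.out.println("Optional codec not available, using fallback");
        }
    }
}
```

On the Java HotSpot VM a missing class only matters once the code path that needs it runs, whereas a closed-world image build has to decide at build time how to treat the missing reference.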

Similarly, support for all security services must be enabled manually via --enable-all-security-services. We will flip the default and enable support by default. (Update: this part is finished, the option --enable-all-security-services is deprecated and a no-op starting with GraalVM 21.1)

Allow multiple classes with the same name

Different class loaders can load multiple classes with the same name. This is currently not supported on native image. However, this is only a historic restriction that can be changed.
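
The situation can be reproduced on the Java HotSpot VM with two class loaders that each define a class of the same name. The sketch below uses invented names (`SameNameClasses`, `DefiningLoader`, `Plugin`). To stay self-contained, it writes the bytes of a minimal class file without members by hand instead of loading a compiled class from disk.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical example class: defines one class name in two different class loaders.
class SameNameClasses {

    // Builds a minimal class file: a public class without members that extends Object.
    static byte[] minimalClassFile(String internalName) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeInt(0xCAFEBABE);                             // magic number
        out.writeShort(0);                                    // minor version
        out.writeShort(52);                                   // major version (Java 8)
        out.writeShort(5);                                    // constant pool count (entries #1..#4)
        out.writeByte(7); out.writeShort(2);                  // #1 Class -> #2
        out.writeByte(1); out.writeUTF(internalName);         // #2 Utf8: this class name
        out.writeByte(7); out.writeShort(4);                  // #3 Class -> #4
        out.writeByte(1); out.writeUTF("java/lang/Object");   // #4 Utf8: super class name
        out.writeShort(0x0021);                               // ACC_PUBLIC | ACC_SUPER
        out.writeShort(1);                                    // this_class = #1
        out.writeShort(3);                                    // super_class = #3
        out.writeShort(0);                                    // no interfaces
        out.writeShort(0);                                    // no fields
        out.writeShort(0);                                    // no methods
        out.writeShort(0);                                    // no attributes
        return bytes.toByteArray();
    }

    // A loader that defines classes directly from bytes, without parent delegation.
    static final class DefiningLoader extends ClassLoader {
        DefiningLoader() {
            super(null);
        }

        Class<?> define(String name, byte[] classFile) {
            return defineClass(name, classFile, 0, classFile.length);
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] classFile = minimalClassFile("Plugin");
        Class<?> first = new DefiningLoader().define("Plugin", classFile);
        Class<?> second = new DefiningLoader().define("Plugin", classFile);
        System.out.println("same name: " + first.getName().equals(second.getName()));
        System.out.println("same class: " + (first == second));
    }
}
```

The two `Class` objects share a name but are distinct types, each tied to its own defining loader; this is the case that the restriction above currently rules out in a native image.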

christianwimmer avatar Aug 18 '20 02:08 christianwimmer

Hello @christianwimmer, regarding this part:

Improve support for resource registration

... Also resources are not separated by class loader, i.e., every class loader sees all resources.

I am not an expert in native-image semantics, but I was under the impression that within an application running as a native image, there is always just one single classloader at runtime. Is that not the case?

jaikiran avatar Aug 19 '20 13:08 jaikiran

@jaikiran

I was under the impression that within an application running as a native image, there is always just one single classloader at runtime. Is that not the case?

No, that is not correct. Different ClassLoader objects that are present at image build time are preserved at image run time. So if you have custom class loaders at image build time (e.g. a URLClassLoader that loads classes that are not on the class path), then the class loader object still exists at run time.

However, right now each of these class loaders sees every class and every resource at run time, i.e., if you call loadClass or getResource on any class loader it also returns classes and resources that (at image build time) were provided by other class loaders. That is something that we need to change: at run time, each class loader object should only return classes and resources that it also could return at image build time. For that, the class and resource registry in the image heap needs to be structured by class loader.

christianwimmer avatar Aug 19 '20 15:08 christianwimmer

@christianwimmer, thank you very much for explaining those details.

jaikiran avatar Aug 21 '20 10:08 jaikiran

This PR removes the --enable-all-security-services option: https://github.com/oracle/graal/pull/3258. We now rely on static analysis to discover which security services are required.

cstancu avatar Mar 05 '21 19:03 cstancu

Hello,

please don't enable --allow-incomplete-classpath, it makes building large projects extremely difficult - especially if one wants to produce a reliable application (don't we all want that?).

Also, regarding "since in practice every large project needs to specify this option" : that's not true. All Quarkus applications are built without this option, as we explicitly prohibit using it.

Sanne avatar Mar 18 '21 15:03 Sanne

@Sanne I'm glad to hear that Quarkus does not need --allow-incomplete-classpath. I totally agree that it makes applications more robust.

From a "works out-of-the-box" point of view, it is bad though, because many libraries have "dangling" dependencies that are just ignored on HotSpot but cause native image build errors. So my current thinking is to make --allow-incomplete-classpath the default, and offer the reverse option so that a library can say "my package namespace can be linked at image build time". And Quarkus can then decide to set this option for all classes. We will certainly also use that approach when, e.g., building the native images for Truffle languages, because we never want to see any linking errors for those at run time.

christianwimmer avatar Mar 18 '21 18:03 christianwimmer

@christianwimmer that makes sense, thank you!

If we can simply revert to the current semantics I see no problem; we already do the same for e.g. not discovering all Services by default (only the ones explicitly chosen) and defaulting to build-time-initialization for most classes (with explicit exceptions). We really like the better optimisations that GraalVM can do in such configuration.

Sanne avatar Mar 18 '21 18:03 Sanne

Citing @christianwimmer from https://github.com/oracle/graal/issues/3491#issuecomment-865352456 :

We have the plan to make --allow-incomplete-classpath the default, and instead provide an option to enforce a "complete" classpath per library / package. [...] I hope to get that done for the 21.3 release.

Having a per lib/package control is a good thing, thanks a lot for it! I hope the implementation will give library maintainers a way to define settings suitable for them while at the same time not polluting all dependent projects, which may prefer something different.

However, making --allow-incomplete-classpath default will make our lives in Camel Quarkus harder. Here is why:

Apache Camel is an integration toolkit that contains 300+ connectors for exchanging data between various heterogeneous systems. Those connectors are typically based on third party libs and frameworks. Camel Quarkus aims at porting all those 300+ connectors to Quarkus, where, as @Sanne pointed out above, there is a strong preference for doing things at build time, because it makes the end user applications more robust and also faster at runtime.

In Camel Quarkus, we will try our luck and enforce a complete class path for every single lib we depend on. It may work for some, while it will fail for others. Often the issues are easy to fix in third party libs and if we see a solution, we gladly report it and/or propose a fix.

Now you may ask what is wrong with all that? Well, I think if most of the ecosystem tests with --allow-incomplete-classpath (because it is the default), it will most probably be us finding the issues at a relatively late stage and bearing the burden of finding solutions for everyone's good.

If enforcing a complete class path stayed the default, I think the load of finding and fixing issues would be spread more evenly across the ecosystem. Library maintainers would have a better chance of finding issues early themselves when testing in native mode. Similarly for library consumers: if more of them test with a complete class path enforced, then there is a better chance that the issues will be found and fixed early.

In situations where the incomplete class path is really necessary, because the library maintainer wants it like that or because there is no other way, the announced per lib/package switch is highly welcome. But still, with a complete class path enforced by default, there would be less chance of it being overused.

ppalaga avatar Jun 22 '21 09:06 ppalaga

PR with a proposed solution to replace --allow-incomplete-classpath: https://github.com/oracle/graal/pull/4305

christianwimmer avatar Feb 28 '22 23:02 christianwimmer

#4753 Add support for including lambda classes to reflection configuration

Eng-Fouad avatar Jul 25 '22 09:07 Eng-Fouad

All items listed in this issue are finished by now.

christianwimmer avatar Sep 11 '22 03:09 christianwimmer

@christianwimmer did this partially solve #1108? It seems that the remaining concern there was about listing resources starting from the root.

bclozel avatar Sep 13 '22 13:09 bclozel