Remove scanning and loading of all classes at beginning of image build

Open christianwimmer opened this issue 5 years ago • 4 comments

TL;DR

The image generator should not scan the complete class path and load all classes on the class path. This slows down image builds.

Details

The first step of building a native image is a complete class path scan: all classes of the provided application class path are loaded in the hosting Java HotSpot VM. This was an early design decision in Native Image that allows the image builder to

  • query all subtypes of a provided type (including all implementations of an interface), and
  • query all classes / methods / fields that have a certain annotation.

But the approach is problematic for several reasons:

  • Loading all classes is time-consuming. For larger applications and benchmarks, a scan time of 30 seconds is not uncommon.
  • If no class path is provided explicitly, the current working directory is the class path. If the current working directory is the user's home directory (or even the file system root, as is common in Docker), large parts of the file system are scanned and the native image tool appears to hang.
  • Class loading can already throw errors when the class path is incomplete or classes do not verify. Such classes are silently ignored during class path scanning, so the queries can unexpectedly miss classes.
  • The JDK itself is not scanned. Additional class loaders used by the application are not scanned either. So the list of subtypes and annotated elements is never a complete and reliable "closed world view".

Access to the list of all classes and annotated elements is not possible via the supported API in org.graalvm.nativeimage. However, we are aware that several frameworks that support Native Image access the information by reaching into implementation-specific data structures. These frameworks need to adapt before the code can be removed from native image.

Alternatives:

  • Instead of scanning all elements, scan only the elements reachable already by the static analysis. The static analysis has a convenient API to register hooks and listeners.
  • Move the scanning code into the application and framework, either using a library of your choice or implementing the scanning yourself. For this, we need to provide the class path of the application via the native image API (see https://github.com/oracle/graal/issues/2600).
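
As a rough illustration of the second alternative, here is a minimal sketch of scanning done on the application side. It only lists class names from directories and JAR files and never loads a class, so it avoids the class loading cost and the silently swallowed load errors described above. The class name `ClassPathLister` and its methods are made up for this example. How the class path entries reach this code is left open, since that depends on the API discussed in the linked issue.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.stream.Stream;

/** Hypothetical helper, not part of any GraalVM API. */
public final class ClassPathLister {
    private static final String SUFFIX = ".class";

    private ClassPathLister() {
    }

    /**
     * Returns the class names found under the given class path entries.
     * No class is loaded. Entries that do not exist are skipped.
     * Naive on purpose: module-info and multi-release entries are not filtered.
     */
    public static List<String> listClassNames(List<Path> entries) {
        List<String> names = new ArrayList<>();
        for (Path entry : entries) {
            try {
                if (Files.isDirectory(entry)) {
                    scanDirectory(entry, names);
                } else if (Files.isRegularFile(entry) && entry.toString().endsWith(".jar")) {
                    scanJar(entry, names);
                }
            } catch (IOException e) {
                // Report unreadable entries instead of ignoring them.
                throw new UncheckedIOException("Cannot scan " + entry, e);
            }
        }
        return names;
    }

    private static void scanDirectory(Path root, List<String> names) throws IOException {
        try (Stream<Path> files = Files.walk(root)) {
            files.filter(Files::isRegularFile)
                 .map(p -> root.relativize(p).toString().replace(File.separatorChar, '/'))
                 .filter(n -> n.endsWith(SUFFIX))
                 .forEach(n -> names.add(toClassName(n)));
        }
    }

    private static void scanJar(Path jar, List<String> names) throws IOException {
        try (JarFile jarFile = new JarFile(jar.toFile())) {
            Enumeration<JarEntry> jarEntries = jarFile.entries();
            while (jarEntries.hasMoreElements()) {
                String name = jarEntries.nextElement().getName();
                if (name.endsWith(SUFFIX)) {
                    names.add(toClassName(name));
                }
            }
        }
    }

    /** Converts a resource name such as "a/b/C.class" to "a.b.C". */
    public static String toClassName(String resourceName) {
        String withoutSuffix = resourceName.substring(0, resourceName.length() - SUFFIX.length());
        return withoutSuffix.replace('/', '.');
    }
}
```

A framework could feed the resulting names into its own filtering (for example by package prefix) and load only the few classes it actually needs.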

Internally, several parts of native image currently rely on the class path information. This list is not exhaustive, but only a starting point of things that need to change:

  • The low-level C interface provided in org.graalvm.nativeimage.c uses annotations. Annotated elements are processed before the static analysis to query values from the imported C libraries. Alternative: do these queries as a separate step before image building (which also simplifies cross compilation).
  • Find all entry points from C code (annotated with @CEntryPoint)
  • Find all @TargetClass annotated classes that have substitutions. Alternative: use service loader to find substitution classes.
  • Find subclasses when registering classes for initialization at build time / run time in ConfigurableClassInitialization
  • Support code for JavaFX and Scala that ships as part of Native Image
  • Registration of graph build plugins for node intrinsics
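
To make the service loader alternative for substitution classes a bit more concrete, here is a minimal sketch using the standard java.util.ServiceLoader. The interface `SubstitutionProvider` and the class `SubstitutionDiscovery` are assumptions made up for this example; Native Image does not define them. The point is only that providers are found through META-INF/services entries, so no scan of the class path is needed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

/** Hypothetical sketch, not part of any GraalVM API. */
public final class SubstitutionDiscovery {

    /** Hypothetical service interface that a library would implement. */
    public interface SubstitutionProvider {
        /** Classes that carry substitution annotations. */
        List<Class<?>> substitutionClasses();
    }

    private SubstitutionDiscovery() {
    }

    /**
     * Collects the classes announced by all providers visible to the loader.
     * Providers are listed in a file named
     * META-INF/services/SubstitutionDiscovery$SubstitutionProvider.
     */
    public static List<Class<?>> discover(ClassLoader loader) {
        List<Class<?>> result = new ArrayList<>();
        for (SubstitutionProvider provider : ServiceLoader.load(SubstitutionProvider.class, loader)) {
            result.addAll(provider.substitutionClasses());
        }
        return result;
    }
}
```

With this shape, only the provider classes named in the service files are loaded, and a missing or broken provider surfaces as a ServiceConfigurationError instead of being silently skipped.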

christianwimmer avatar Jun 22 '20 19:06 christianwimmer

Hi Christian,

This sounds like an 'interesting' change -- but also one well worth investigating :-).

I think the idea of providing hooks for consumers of native image to direct the class graph search sounds very promising. Red Hat are keen to help with implementing changes along these lines.

adinn avatar Jun 23 '20 09:06 adinn

what about this @TargetClass and @Substitute?

Without scanning, GraalVM can never see these classes.

@TargetClass(value = Png.class)
public final class PngReplacement {
  @Substitute long encodeRGBA(...); 
}

kirillp avatar Nov 24 '20 14:11 kirillp

what about this @TargetClass and @Substitute?

The easiest way to handle these things will be annotation processing when the source code is compiled, i.e., let javac already collect the list of annotated classes.
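
A minimal sketch of what such an annotation processor could look like, using only the standard javax.annotation.processing API. The class name `AnnotatedClassIndexer` and the index file path are made up for this example, and nothing in Native Image reads that file today. The annotation name is assumed to be com.oracle.svm.core.annotate.TargetClass.

```java
import java.io.IOException;
import java.io.Writer;
import java.util.Set;
import java.util.TreeSet;
import javax.annotation.processing.AbstractProcessor;
import javax.annotation.processing.RoundEnvironment;
import javax.annotation.processing.SupportedAnnotationTypes;
import javax.lang.model.SourceVersion;
import javax.lang.model.element.Element;
import javax.lang.model.element.TypeElement;
import javax.tools.Diagnostic;
import javax.tools.FileObject;
import javax.tools.StandardLocation;

/** Hypothetical processor: records annotated classes while javac compiles them. */
@SupportedAnnotationTypes("com.oracle.svm.core.annotate.TargetClass")
public final class AnnotatedClassIndexer extends AbstractProcessor {
    /** Hypothetical index location, chosen only for this sketch. */
    static final String INDEX_PATH = "META-INF/annotated-classes.idx";

    private final Set<String> annotatedClasses = new TreeSet<>();

    @Override
    public SourceVersion getSupportedSourceVersion() {
        return SourceVersion.latestSupported();
    }

    @Override
    public boolean process(Set<? extends TypeElement> annotations, RoundEnvironment roundEnv) {
        for (TypeElement annotation : annotations) {
            for (Element element : roundEnv.getElementsAnnotatedWith(annotation)) {
                if (element instanceof TypeElement) {
                    TypeElement type = (TypeElement) element;
                    annotatedClasses.add(processingEnv.getElementUtils().getBinaryName(type).toString());
                }
            }
        }
        // Write the index once, in the final round.
        if (roundEnv.processingOver() && !annotatedClasses.isEmpty()) {
            writeIndex();
        }
        return false; // leave the annotation available to other processors
    }

    private void writeIndex() {
        try {
            FileObject file = processingEnv.getFiler()
                    .createResource(StandardLocation.CLASS_OUTPUT, "", INDEX_PATH);
            try (Writer writer = file.openWriter()) {
                for (String name : annotatedClasses) {
                    writer.write(name);
                    writer.write('\n');
                }
            }
        } catch (IOException e) {
            processingEnv.getMessager().printMessage(
                    Diagnostic.Kind.ERROR, "Cannot write " + INDEX_PATH + ": " + e);
        }
    }
}
```

The image builder would then read one small index file per JAR instead of loading every class to look for annotations.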

christianwimmer avatar Nov 24 '20 19:11 christianwimmer

This is still on the 21.1 milestone. Which milestone should it be moved to?

gilles-duboscq avatar Jun 15 '21 14:06 gilles-duboscq