clj.native-image

Windows classpath length issue

Open taylorwood opened this issue 4 years ago • 20 comments

@sogaiu reported an issue on Windows where the classpath is hitting some length limit.

For Java 9 and up, it's possible to specify a classpath file (that contains the classpath argument) rather than passing the classpath itself as an argument. It might be possible to work around the length issue with that.

taylorwood avatar Nov 27 '19 05:11 taylorwood

First came across mention of this feature at:

https://stackoverflow.com/a/54270831

where there is some explanation.

The following quote is from the official docs at:

https://docs.oracle.com/javase/9/tools/java.htm#JSWOR-GUID-4856361B-8BFD-4964-AE84-121F5F6CF111

In the command line, use the at sign (@) prefix to identify an argument file that contains java options and class names. When the java command encounters a file beginning with the at sign (@) , it expands the contents of that file into an argument list just as they would be specified on the command line.

So to use this functionality, it looks like an external file is necessary.

The official docs have a number of examples and the stackoverflow post has an example.
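As a rough illustration of the argument-file idea for the java launcher, here is a minimal Python sketch that writes a classpath into such a file. The function name and the sample paths are made up for this example. It assumes the quoting rule from the java docs, where backslashes inside double quotes are escaped by doubling them.

```python
from pathlib import Path


def write_classpath_argfile(classpath: str, argfile: Path) -> str:
    """Write `-cp "<classpath>"` to an argument file and return the @-argument.

    Hypothetical helper for illustration; not part of any existing tool.
    """
    # Inside double quotes the java launcher treats a backslash as an escape
    # character, so Windows path separators are doubled.
    escaped = classpath.replace("\\", "\\\\")
    argfile.write_text(f'-cp "{escaped}"\n', encoding="utf-8")
    return f"@{argfile}"
```

The returned string (for example `@cp.args`) would take the place of the long `-cp ...` argument on the command line, so the command line itself stays short no matter how long the classpath is.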

sogaiu avatar Nov 27 '19 05:11 sogaiu

Hmm, inspecting the output of the Java 11 graal 19.3.0's native-image, I don't see the @-file support listed:

$ ./native-image --help

GraalVM native-image building tool

This tool can be used to generate an image that contains ahead-of-time compiled Java code.

Usage: native-image [options] class [imagename] [options]
           (to build an image for a class)
   or  native-image [options] -jar jarfile [imagename] [options]
           (to build an image for a jar file)
where options include:
    -cp <class search path of directories and zip/jar files>
    -classpath <class search path of directories and zip/jar files>
    --class-path <class search path of directories and zip/jar files>
                          A : separated list of directories, JAR archives,
                          and ZIP archives to search for class files.
    -D<name>=<value>      set a system property
    -J<flag>              pass <flag> directly to the JVM running the image generator
    -O<level>             0 - no optimizations, 1 - basic optimizations (default).
    --verbose             enable verbose output
    --version             print product version and exit
    --help                print this help message
    --help-extra          print help on non-standard options

    --allow-incomplete-classpath
                          allow image building with an incomplete class path: report type
                          resolution errors at run time when they are accessed the first
                          time, instead of during image building
    --auto-fallback       build stand-alone image if possible
    --enable-all-security-services
                          add all security service classes to the generated image.
    --enable-http         enable http support in the generated image
    --enable-https        enable https support in the generated image
    --enable-url-protocols
                          list of comma separated URL protocols to enable.
    --features            a comma-separated list of fully qualified Feature implementation
                          classes
    --force-fallback      force building of fallback image
    --initialize-at-build-time
                          a comma-separated list of packages and classes (and implicitly all
                          of their superclasses) that are initialized during image
                          generation. An empty string designates all packages.
    --initialize-at-run-time
                          a comma-separated list of packages and classes (and implicitly all
                          of their subclasses) that must be initialized at runtime and not
                          during image building. An empty string is currently not
                          supported.
    --no-fallback         build stand-alone image or report failure
    --report-unsupported-elements-at-runtime
                          report usage of unsupported methods and fields at run time when
                          they are accessed the first time, instead of as an error during
                          image building
    --shared              build shared library
    --static              build statically linked executable (requires static libc and zlib)
    -da                   disable assertions in the generated image
    -ea                   enable assertions in the generated image

Available macro-options are:
    --language:nfi
    --language:js
    --language:regex
    --language:llvm
    --tool:coverage
    --tool:profiler
    --tool:chromeinspector
    --tool:agentscript

Whereas for java -help (for AdoptOpenJDK 11), I see:

    @argument files
                  one or more argument files containing options

Maybe there isn't any support for this in native-image...

sogaiu avatar Nov 27 '19 19:11 sogaiu

Did some brief testing and didn't get the sense that @-files are supported.

Have also asked at the graalvm slack's #native-image.

sogaiu avatar Nov 27 '19 19:11 sogaiu

Apparently, other approaches include:

  • Pathing jar: https://stackoverflow.com/a/201969
  • Classpath wildcards: https://stackoverflow.com/a/202034

Whether native-image has any support for these is not clear yet...

sogaiu avatar Nov 27 '19 20:11 sogaiu

Brief testing of the pathing jar approach suggests it might work.

For testing purposes, I tried lein uberjar in clj-kondo's project directory to get something to reference in the pathing jar's Manifest.mf.

Then created the following file / folder hierarchy:

META-INF/
└── Manifest.mf

Put the following in Manifest.mf:

Manifest-Version: 1.0
Class-Path: target/clj-kondo-2019.11.24-SNAPSHOT-standalone.jar
Main-Class: clj_kondo.main

Created pathing.jar by:

zip -r pathing.jar META-INF

Then invoked:

native-image -jar pathing.jar \
    "-H:Name=clj-kondo" \
    "-H:+ReportExceptionStackTraces" \
    "-J-Dclojure.spec.skip-macros=true" \
    "-J-Dclojure.compiler.direct-linking=true" \
    "-H:IncludeResources=clj_kondo/impl/cache/built_in/.*" \
    "-H:ReflectionConfigurationFiles=reflection.json" \
    "--initialize-at-build-time"  \
    "-H:Log=registerResource:" \
    "--verbose" \
    "--no-fallback" \
    "--no-server" \
    "-J-Xmx3g"

(Everything after the first line is just the ordinary stuff clj-kondo needs to be built, so I think we can ignore it for this discussion.)

This created a working clj-kondo.

On a side note, as if this isn't enough yak-shaving, there is apparently a line length limitation in manifest files...which apparently can be worked around by:

https://stackoverflow.com/a/3057862

For reference, other bits at the same and previous SO pages may contain even more caveats...

One caveat is that the paths listed as part of the value for Class-Path: need to be relative. Maybe that's not too bad.

Exactly how to format the value may also be an issue:

https://stackoverflow.com/a/33468204
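To make the manifest caveats concrete, here is a minimal Python sketch that builds a pathing jar and wraps long header lines. The function names are hypothetical. It assumes the rules from the JAR file specification: no manifest line longer than 72 bytes, continuation lines starting with a single space, and Class-Path entries given as space-separated relative URLs. It also assumes the entries are ASCII relative paths with forward slashes.

```python
import zipfile
from urllib.parse import quote


def manifest_line(name: str, value: str) -> bytes:
    """Encode one manifest header, wrapping so no line exceeds 72 bytes."""
    data = f"{name}: {value}".encode("utf-8")
    lines = [data[:72]]
    rest = data[72:]
    while rest:
        # A continuation line starts with one space, leaving 71 bytes of payload.
        lines.append(b" " + rest[:71])
        rest = rest[71:]
    return b"\r\n".join(lines) + b"\r\n"


def build_pathing_jar(jar_path, entries, main_class=None):
    """Write a jar that contains only a manifest with a Class-Path header.

    `entries` are relative paths (hypothetical input); spaces are
    percent-encoded because Class-Path entries are space-separated URLs.
    """
    headers = [
        ("Manifest-Version", "1.0"),
        ("Class-Path", " ".join(quote(entry) for entry in entries)),
    ]
    if main_class:
        headers.append(("Main-Class", main_class))
    # The main section is terminated by an empty line.
    manifest = b"".join(manifest_line(n, v) for n, v in headers) + b"\r\n"
    with zipfile.ZipFile(jar_path, "w") as zf:
        zf.writestr("META-INF/MANIFEST.MF", manifest)
```

Calling `build_pathing_jar("pathing.jar", ["target/clj-kondo-2019.11.24-SNAPSHOT-standalone.jar"], "clj_kondo.main")` would produce roughly the jar described above, with the long Class-Path value wrapped automatically.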

sogaiu avatar Nov 27 '19 20:11 sogaiu

Perhaps the other approach (classpath wildcard) might not be too bad since Windows 10 has support for symlinks (I think).

Maybe a temporary directory could be made and filled with symlinks to each .jar item that is desired for the classpath, while the non-jar items are specified normally:

Class path entries can contain the basename wildcard character *, which is considered equivalent to specifying a list of all the files in the directory with the extension .jar or .JAR.

...

To match both classes and JAR files in a single directory foo, use either foo;foo/* or foo/*;foo.

via:

https://docs.oracle.com/javase/6/docs/technotes/tools/windows/classpath.html#Understanding

(The link actually points to the section immediately after the relevant one, so scrolling up might help.)

sogaiu avatar Nov 27 '19 21:11 sogaiu

One more approach that has turned up is to use the CLASSPATH environment variable.

That sounds the simplest, ~~but I don't know about the length limitations of this on Windows.~~

According to:

https://devblogs.microsoft.com/oldnewthing/20100203-00/?p=15083

All environment variables must live together in a single environment block, which itself has a limit of 32767 characters.

So it appears possible that depending on what other environment variables are set to, using the CLASSPATH environment variable could be an improvement.
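A rough way to check whether a given classpath would fit is to estimate the size of the environment block. The Python sketch below is only an approximation with hypothetical function names: it assumes each variable is stored as NAME=value followed by a NUL, with one extra NUL ending the block, and it ignores any hidden variables Windows may add.

```python
import os

ENV_BLOCK_LIMIT = 32767  # characters, per the article quoted above


def env_block_size(env) -> int:
    """Approximate size of an environment block in characters."""
    # NAME + "=" + value + NUL for each variable, plus one final NUL.
    return sum(len(k) + 1 + len(v) + 1 for k, v in env.items()) + 1


def classpath_fits(classpath: str, env=None) -> bool:
    """Check whether adding CLASSPATH keeps the block under the limit."""
    env = dict(os.environ if env is None else env)
    env["CLASSPATH"] = classpath
    return env_block_size(env) <= ENV_BLOCK_LIMIT
```

With an otherwise small environment, a classpath of a few thousand characters leaves plenty of headroom under this estimate.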

sogaiu avatar Nov 27 '19 21:11 sogaiu

Maybe trying to use symlinks is worth considering.

I'm not exactly sure what the requirements are, but found this from some time back:

Starting with Windows 10 Insiders build 14972, symlinks can be created without needing to elevate the console as administrator.

and:

Now in Windows 10 Creators Update, a user (with admin rights) can first enable Developer Mode, and then any user on the machine can run the mklink command without elevating a command-line console.

via: https://blogs.windows.com/windowsdeveloper/2016/12/02/symlinks-windows-10/

Maybe "Developer Mode" needs to be enabled.

Even if that's necessary, maybe it's not such a bad requirement, given that native-image is likely to be used mostly by developers?

sogaiu avatar Nov 28 '19 01:11 sogaiu

Below are some concrete numbers for the project I had difficulty with.

  • The project that has the problem has a classpath length of 8487 on a Linux box -- on Windows it looks like this is 8793 (based on the content of the .cp file in .cpcache).
  • There are 93 items in the classpath string, with 88 of them (.jars) in the local maven repository. The rest are directories.
  • The path to the local maven repository is 26 characters, so a bit under 2300 of the characters are just the maven repository path.
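The estimate in the last bullet can be checked with a few lines of Python, using only the numbers listed above:

```python
jar_count = 88         # classpath entries located in the local maven repository
repo_prefix_len = 26   # length of the path to the local maven repository
total_len = 8793       # classpath length observed on Windows

prefix_chars = jar_count * repo_prefix_len
print(prefix_chars)                            # 2288
print(round(100 * prefix_chars / total_len))   # 26 (percent of the classpath)
```

So roughly a quarter of the classpath string is the repeated repository prefix.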

sogaiu avatar Nov 28 '19 15:11 sogaiu

I got confirmation from Paul Wögerer on #native-image that there isn't currently support for @-files in native-image.

sogaiu avatar Nov 28 '19 15:11 sogaiu

It could be that the clj / clojure tooling on Windows is experiencing this issue.

If it turns out there's a way for the classpath to get shortened either before clj / clojure receives it, or before clj / clojure launches its final jvm, maybe clj.native-image will inherit a short enough classpath. Then maybe native-image will launch without problems.

sogaiu avatar Nov 28 '19 16:11 sogaiu

An attempt at a summary of the state of classpath-and-windows-length-limits-do-not-get-along-work-around-functionality available in native-image:

  • uberjar: build an uberjar and hand it to native-image using its -jar option. borkdude uses this method in clj-kondo, babashka, etc.
  • Pathing.jar: experiments suggest, yes: https://github.com/taylorwood/clj.native-image/issues/18#issuecomment-559240320
  • Classpath wildcards: apparently yes: https://github.com/oracle/graal/issues/1558#issuecomment-517587469
  • @-files: not supported according to Paul Wögerer on #native-image

So, how can one of the usable bits be applied?

The uberjar method has a track record. Where I've seen it, the uberjar is constructed via Leiningen. I would prefer a solution that doesn't use that. Maybe there is an alternate tool that could be used to a similar end. Not sure if depstar or uberdeps are enough -- more options may be available at: https://github.com/clojure/tools.deps.alpha/wiki/Tools#packaging

As mentioned in a comment above (https://github.com/taylorwood/clj.native-image/issues/18#issuecomment-559240320), the manifest file has some limitations to it. I don't know if that's something that can be surmounted. On the surface, it makes the pathing.jar approach not-so-attractive.

Symlinks are available on Windows 10 machines with Developer Mode enabled, so combining them with the classpath wildcard functionality seems promising. (Creating the symlinks doesn't appear to be doable from Java 8 or Java 11 directly: https://bugs.java.com/bugdatabase/view_bug.do?bug_id=JDK-8218418 -- so shelling out to mklink might be a work-around for code or a tool running on Windows.)

sogaiu avatar Dec 17 '19 02:12 sogaiu

To expand a bit on the "promising" idea, it might work like this:

  1. Use clj to determine a classpath
  2. Separate out the directories from the jar files in the classpath
  3. Create a new directory, say symholder, to hold symlinks to the jar files
  4. Create symlinks to the jar files within the new directory symholder
  5. Create a new classpath consisting of the directories from the classpath along with an entry for symholder/*
  6. Use the new classpath for both clj / clojure and native-image invocation

Notes:

  • The symlinks to the jar files need to be named in a way that they don't conflict. Just taking the names of the jar files themselves might occasionally lead to conflicts.

  • The order of things in classpath might change using this procedure. According to the "Setting the Class Path" doc:

The order in which the JAR files in a directory are enumerated in the expanded class path is not specified and may vary from platform to platform and even from moment to moment on the same machine. A well-constructed application should not depend upon any particular order. If a specific order is required, then the JAR files can be enumerated explicitly in the class path.

  • The symlinks to the jar files must be directly within symholder and not within a subdirectory of symholder. Apparently the classpath wildcard functionality is not recursive:

Subdirectories are not searched recursively. For example, mydir/* searches for JAR files only in mydir, not in mydir/subdir1, mydir/subdir2, and so on.

Reference: Setting the Class Path : Class Path Wild Cards
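The procedure above (steps 2 to 5) can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not an existing tool: the function name is hypothetical, entries ending in .jar are treated as jar files and everything else as a directory, and on Windows `os.symlink` is assumed to need Developer Mode or an elevated console.

```python
import os
from pathlib import Path


def shorten_classpath(classpath: str, holder, sep: str = os.pathsep) -> str:
    """Replace the jar entries of a classpath with a single `holder/*` entry."""
    holder = Path(holder)
    holder.mkdir(parents=True, exist_ok=True)

    dirs, jars = [], []
    for entry in classpath.split(sep):
        if not entry:
            continue
        (jars if entry.lower().endswith(".jar") else dirs).append(entry)

    for index, jar in enumerate(jars):
        # Prefix each link with its position so that jars sharing a file name
        # cannot collide. The wildcard expansion order is still unspecified.
        link = holder / f"{index:04d}-{Path(jar).name}"
        if not os.path.lexists(link):
            os.symlink(os.path.abspath(jar), link)

    # Directories stay as they are; all jars collapse into one wildcard entry.
    return sep.join(dirs + ([str(holder / "*")] if jars else []))
```

The links are created directly inside the holder directory, since the wildcard is not recursive. The returned string would then be used as the classpath for both clj / clojure and native-image.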

sogaiu avatar Dec 17 '19 02:12 sogaiu

Supposing there was code or a tool to accomplish the symlink + classpath wildcard idea, it might be used as follows:

  1. Use the tool / code to create a new classpath that makes use of the directory with symlinks described in the previous comment
  2. clj.native-image's native-image-classpath (or a separate function) uses the new classpath to invoke native-image

Not sure yet how to invoke clj.native-image via an alias in a convenient way. That is, invoking clj -A:native-image seems like it will lead to a too-long classpath being used.

Something like:

clj -Scp <new-shorter-classpath> -A:native-image

seems like it could work, but it's certainly not as nice. Maybe that's not worth worrying about.

sogaiu avatar Dec 17 '19 04:12 sogaiu

On Windows 10, I compared the .cp file created via clj -A:native-image, to a plain invocation of clj.

Oddly enough, the native-image classpath is quite a bit longer (~7500 bytes vs ~250 bytes).

I fabricated an invocation of native-image using the content of the .cp file for the plain invocation. The result was a successful build.

A similar fabricated invocation using the content of the .cp file from the clj -A:native-image run ended in failure:

...
The input line is too long.
The syntax of the command is incorrect.

I will try to compare the two classpaths.
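One simple way to do that comparison is to list the entries that appear only in the longer classpath. A minimal Python sketch follows; the function name is hypothetical, and `;` is assumed as the separator because the .cp files come from Windows.

```python
def classpath_diff(short_cp: str, long_cp: str, sep: str = ";") -> list:
    """Return the entries of long_cp that do not occur in short_cp, in order."""
    seen = {entry for entry in short_cp.split(sep) if entry}
    return [entry for entry in long_cp.split(sep) if entry and entry not in seen]
```

Feeding it the contents of the two .cp files would show which extra entries the clj -A:native-image run added.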

sogaiu avatar Dec 17 '19 21:12 sogaiu

It looks like invoking clj -A:native-image for at least some projects leads to clj.native-image's dependencies (including transitive) being added to the .cp file for those projects.

Specifically, (something close to, if not) the following:

aopalliance/aopalliance 1.0
com.amazonaws/aws-java-sdk-core 1.11.184
com.amazonaws/aws-java-sdk-kms 1.11.184
com.amazonaws/aws-java-sdk-s3 1.11.184
com.amazonaws/jmespath-java 1.11.184
com.fasterxml.jackson.core/jackson-annotations 2.5.0
com.fasterxml.jackson.core/jackson-core 2.5.5
com.fasterxml.jackson.core/jackson-databind 2.5.5
com.fasterxml.jackson.dataformat/jackson-dataformat-cbor 2.6.7
com.google.guava/guava 20.0
com.google.inject/guice$no_aop 4.0
com.googlecode.javaewah/JavaEWAH 1.1.6
com.jcraft/jsch.agentproxy.connector-factory 0.0.9
com.jcraft/jsch.agentproxy.core 0.0.9
com.jcraft/jsch.agentproxy.jsch 0.0.9
com.jcraft/jsch.agentproxy.pageant 0.0.9
com.jcraft/jsch.agentproxy.sshagent 0.0.9
com.jcraft/jsch.agentproxy.usocket-jna 0.0.9
com.jcraft/jsch.agentproxy.usocket-nc 0.0.9
com.jcraft/jsch 0.1.54
commons-codec/commons-codec 1.10
commons-io/commons-io 2.5
commons-logging/commons-logging 1.1.3
javax.annotation/jsr250-api 1.0
javax.enterprise/cdi-api 1.0
javax.inject/javax.inject 1
joda-time/joda-time 2.8.1
net.java.dev.jna/jna-platform 4.1.0
net.java.dev.jna/jna 4.1.0
org.apache.commons/commons-lang3 3.5
org.apache.httpcomponents/httpclient 4.5.4
org.apache.httpcomponents/httpcore 4.4.8
org.apache.maven.resolver/maven-resolver-api 1.1.1
org.apache.maven.resolver/maven-resolver-connector-basic 1.1.1
org.apache.maven.resolver/maven-resolver-impl 1.1.1
org.apache.maven.resolver/maven-resolver-spi 1.1.1
org.apache.maven.resolver/maven-resolver-transport-file 1.1.1
org.apache.maven.resolver/maven-resolver-transport-http 1.1.1
org.apache.maven.resolver/maven-resolver-transport-wagon 1.1.1
org.apache.maven.resolver/maven-resolver-util 1.1.1
org.apache.maven.shared/maven-shared-utils 3.1.0
org.apache.maven.wagon/wagon-provider-api 3.0.0
org.apache.maven/maven-artifact 3.5.2
org.apache.maven/maven-builder-support 3.5.2
org.apache.maven/maven-core 3.5.2
org.apache.maven/maven-model-builder 3.5.2
org.apache.maven/maven-model 3.5.2
org.apache.maven/maven-plugin-api 3.5.2
org.apache.maven/maven-repository-metadata 3.5.2
org.apache.maven/maven-resolver-provider 3.5.2
org.apache.maven/maven-settings-builder 3.5.2
org.apache.maven/maven-settings 3.5.2
org.clojure/clojure 1.10.1
org.clojure/core.specs.alpha 0.2.44
org.clojure/data.codec 0.1.0
org.clojure/data.xml 0.2.0-alpha5
org.clojure/java.classpath 0.3.0
org.clojure/spec.alpha 0.2.176
org.clojure/tools.cli 0.3.5
org.clojure/tools.deps.alpha 0.7.549
org.clojure/tools.gitlibs 0.2.64
org.clojure/tools.namespace 0.3.1
org.clojure/tools.reader 1.3.2
org.codehaus.plexus/plexus-classworlds 2.5.2
org.codehaus.plexus/plexus-component-annotations 1.7.1
org.codehaus.plexus/plexus-interpolation 1.24
org.codehaus.plexus/plexus-utils 3.1.0
org.eclipse.jgit/org.eclipse.jgit 4.10.0.201712302008-r
org.eclipse.sisu/org.eclipse.sisu.inject 0.3.3
org.eclipse.sisu/org.eclipse.sisu.plexus 0.3.3
org.slf4j/jcl-over-slf4j 1.7.25
org.slf4j/slf4j-api 1.7.25
org.sonatype.plexus/plexus-cipher 1.4
org.sonatype.plexus/plexus-sec-dispatcher 1.4
org.springframework.build/aws-maven 5.0.0.RELEASE
s3-wagon-private/s3-wagon-private 1.3.1
software.amazon.ion/ion-java 1.0.2

(Obtained via clj -Stree in clj.native-image's project directory and then compared with the .cp file content)

sogaiu avatar Dec 17 '19 21:12 sogaiu

@sogaiu I'm having the same issue and I'm currently stuck. I cannot use Leiningen because it doesn't work with cljfx and uberjar and I cannot use deps.edn because of this issue. I haven't had any issues with Leiningen and lein-native-image plugin. Did you figure out how to remove those transitive dependencies?

mhavrlent avatar Mar 31 '20 21:03 mhavrlent

I don't think a solution was arrived at that made it into clj.native-image.

IIRC, the compilation being done in the same process as clj.native-image running can lead to this situation. It's been a while but I think it had to do with overlaps between clj.native-image's dependencies and those of one's project: https://github.com/taylorwood/clj.native-image/issues/20

As mentioned in #20, I worked around this once by using an older version of clj.native-image. Whether that sort of thing works might be project-specific though.

Not sure if it would help for your case, but one thing that came about from these investigations was this: https://github.com/sogaiu/clj-pathing-jar

Sorry if this is a bit fragmented -- don't have the context loaded up :sweat_smile:

sogaiu avatar Mar 31 '20 22:03 sogaiu

Hello!

I have just faced the same issue and it seems that graal now supports arguments file. https://github.com/oracle/graal/pull/2443

Will it help with resolving the issue?

ir4y avatar May 05 '21 10:05 ir4y

So I ran into this issue, and solved it by writing the classpath argument to a file as mentioned above https://github.com/skynet-gh/skylobby/blob/53a56f8/dev/clj/graal/native_image.clj#L20-L27

skynet-gh avatar Oct 22 '21 04:10 skynet-gh