mill icon indicating copy to clipboard operation
mill copied to clipboard

Analyze assembly error - Reproduction of #528 and #2650

Open lefou opened this issue 2 years ago • 9 comments

Reprodcution of issues

  • https://github.com/com-lihaoyi/mill/issues/528
  • https://github.com/com-lihaoyi/mill/issues/2650

lefou avatar Jul 05 '23 15:07 lefou

It looks to me, the main class can't be found if the jar reaches a specific size and the main class is located at the end of the jar.

I made some tests with my alpha-grade mill-spring-boot plugin, which also provided support to executable jars. The difference is, that not all jars are unpacked but included as is and a dedicated main is used. I added this failing test case and it can be started. My assumption is, that the specific spring-boot main class is located at the start of the JAR vs. in our case we append the main class at the end.

lefou avatar Jul 12 '23 08:07 lefou

One thing I've found is that even jar tf on the problematic prefixed assembly jar fails

java.util.zip.ZipException: invalid CEN header (bad signature)
	at java.base/java.util.zip.ZipFile$Source.zerror(ZipFile.java:1598)
	at java.base/java.util.zip.ZipFile$Source.checkAndAddEntry(ZipFile.java:1183)
	at java.base/java.util.zip.ZipFile$Source.initCEN(ZipFile.java:1537)
	at java.base/java.util.zip.ZipFile$Source.<init>(ZipFile.java:1315)
	at java.base/java.util.zip.ZipFile$Source.get(ZipFile.java:1277)
	at java.base/java.util.zip.ZipFile$CleanableResource.<init>(ZipFile.java:709)
	at java.base/java.util.zip.ZipFile.<init>(ZipFile.java:243)
	at java.base/java.util.zip.ZipFile.<init>(ZipFile.java:172)
	at java.base/java.util.zip.ZipFile.<init>(ZipFile.java:143)
	at jdk.jartool/sun.tools.jar.Main.list(Main.java:1502)
	at jdk.jartool/sun.tools.jar.Main.run(Main.java:361)
	at jdk.jartool/sun.tools.jar.Main.main(Main.java:1683)

but if I manually cut out the prependShellScript of the problematic assembly jar and create a new jar (as shown below) then jar tf works

@ val exeBytes = os.read.bytes(
      os.Path("/Users/lihaoyi/Github/mill/integration/feature/assembly/repo/out/exe/large/assembly.dest/out.jar")
    )

@ os.write(os.pwd / "newExe.jar", exeBytes.drop(209))

Also, unzip -l and unzip both work on both the problematic and non-problematic assembly jars. zipinfo works as well

Could it be some kind of limitation in the Java zip reading implementation? It seems unlikely, but given that unzip and zipinfo work, it seems the problematic behavior is Java-specific and not an inherent problem with the zip files involved

lihaoyi avatar Aug 02 '23 07:08 lihaoyi

The JDK java code in that stack trace is https://github.com/openjdk/jdk/blob/9454b2bbe130fdbe86485b928b80d19156c709ee/src/java.base/share/classes/java/util/zip/ZipFile.java#L1189-L1217

lihaoyi avatar Aug 02 '23 07:08 lihaoyi

Another data point: taking the problematic jar, unziping it and re-ziping it generates a non-problematic zip file, without a prepended shell script, that can be read without issue via jar tf, But cating it to append to it a launcher shell script again creates a final jar file that cannot be read by jar tf (same error as above) but can be read by zip.

This somewhat confirms that it isn't our implementation of create-a-zip-file that is problematic: a native unix zip-created archive, prepended with a shell script, fails to read via jar tf (which uses java.util.zip.ZipFile) but can be successfully read via unzip -l.

That means the problem has to be the interaction between the JVM zip reading code and the the prepended shell script, since the problem remains even after swapping out the zip writing code, but the problem goes away if we remove the prepended shell script or we use a non-JVM program to read the prepended jar file

lihaoyi avatar Aug 03 '23 03:08 lihaoyi

I made almost the same tests a while ago. I think it worth to note that it's not the sheer file size, that makes the JRE zip implementation fail, as an assembly jar built with mill-spring-boot and the exact same dependencies can be started. The difference is, that the spring boot assembly contains a lot less zip entries, as it is not unpacking all the dependencies, but rather just embeds the original jars. So, it's either the entry count (=catalog size) or the position (index) of the main class in that catalog. An interesting test would be to add the main class first (don't append to the upstream assembly jar). Though, I don't think it will fix it, from looking at the JDK code. Maybe, we can count the zip entries ourself and print a warning if we detect, that it will be too large to be executable. What I also wonder, we can't be the first who fell into that trap.

lefou avatar Aug 03 '23 07:08 lefou

Another data point, I tried this with SBT-1.8.2 and SBT-Assembly 2.2.1, and it has the same failure mode: jar tf fails if the assembly jar was created with assemblyPrependShellScript := Some(defaultUniversalScript(shebang = true)), and jar tf succeeds if the assembly jar was created without that config. So Mill is not the only tool where this mis-behavior appears

lihaoyi avatar Aug 07 '23 10:08 lihaoyi

Another data point is that zip (note: not unzip!) provides different errors when trying to load the two prepended jars, but does not give any errors trying to process the non-prepended jars

lihaoyi mill$ zip /Users/lihaoyi/Github/mill/integration/feature/assembly/repo/out/exe/large/assembly.dest/out.jar
	zip warning: Zip64 EOCDR not found where expected - compensating
	zip warning: (try -A to adjust offsets)
	zip warning: expected 75516 entries but found 75512

zip error: Zip file structure invalid (/Users/lihaoyi/Github/mill/integration/feature/assembly/repo/out/exe/large/assembly.dest/out.jar)


lihaoyi mill$ zip /Users/lihaoyi/Github/mill/integration/feature/assembly/repo/out/exe/small/assembly.dest/out.jar
	zip warning: unexpected signature on disk 0 at 10221122

	zip warning: archive not in correct format: /Users/lihaoyi/Github/mill/integration/feature/assembly/repo/out/exe/small/assembly.dest/out.jar
	zip warning: (try -F to attempt recovery)

zip error: Zip file structure invalid (/Users/lihaoyi/Github/mill/integration/feature/assembly/repo/out/exe/small/assembly.dest/out.jar)


lihaoyi mill$ zip /Users/lihaoyi/Github/mill/integration/feature/assembly/repo/out/noExe/large/assembly.dest/out.jar

zip error: Nothing to do! (/Users/lihaoyi/Github/mill/integration/feature/assembly/repo/out/noExe/large/assembly.dest/out.jar)


lihaoyi mill$ zip /Users/lihaoyi/Github/mill/integration/feature/assembly/repo/out/noExe/small/assembly.dest/out.jar

zip error: Nothing to do! (/Users/lihaoyi/Github/mill/integration/feature/assembly/repo/out/noExe/small/assembly.dest/out.jar)

lihaoyi avatar Aug 10 '23 01:08 lihaoyi

Final bit of debugging: it appears that the misbehavior of jar tf on a prepended jar starts when the number of entries in the jar file hits exactly 65535 zip entries.

A prepended jar with 65534 entries can be jar tfed without issue, while one with 65535 entries fails with the java.util.zip.ZipException: invalid CEN header (bad signature) error. The size of the entries in the jar file doesn't seem to matter. Zeroing them out does not change the number of entries necessary to hit the problem.

This was tested by applying the following diff

$ git diff
diff --git a/integration/feature/assembly/repo/build.sc b/integration/feature/assembly/repo/build.sc
index 43d09cd415..62bf22aff7 100644
--- a/integration/feature/assembly/repo/build.sc
+++ b/integration/feature/assembly/repo/build.sc
@@ -30,5 +30,7 @@ object noExe extends Module {

 object exe extends Module {
   object small extends Setup
-  object large extends Setup with ExtraDeps
+  object large extends Setup with ExtraDeps{
+    override def mainClass = Some("foo.bar.Main")
+  }
 }
diff --git a/scalalib/src/mill/scalalib/Assembly.scala b/scalalib/src/mill/scalalib/Assembly.scala
index cd25d18152..92c1a81448 100644
--- a/scalalib/src/mill/scalalib/Assembly.scala
+++ b/scalalib/src/mill/scalalib/Assembly.scala
@@ -223,9 +223,12 @@ object Assembly {
       )
       manifest.build.write(manifestOut)
       manifestOut.close()
-
+      var writeCount = 0
       val (mappings, resourceCleaner) = Assembly.loadShadedClasspath(inputPaths, assemblyRules)
       try {
+        // Good: 32000 48000 56000 60000 62000 62750 62762 62763 62764 62766
+        // Bad:  62767 62770  62775 63000 63016 63032 63064 63125 63250 63500 64000
+        val max =  62767
         Assembly.groupAssemblyEntries(mappings, assemblyRules).foreach {
           case (mapping, entry) =>
             val path = zipFs.getPath(mapping).toAbsolutePath
@@ -238,8 +241,16 @@ object Assembly {
                 val cleaned = if (Files.exists(path)) separated else separated.drop(1)
                 val concatenated =
                   new SequenceInputStream(Collections.enumeration(cleaned.asJava))
-                writeEntry(path, concatenated, append = true)
-              case entry: WriteOnceEntry => writeEntry(path, entry.inputStream(), append = false)
+
+                if (writeCount < max){
+                  writeCount += 1
+                  writeEntry(path, concatenated, append = true)
+                }
+              case entry: WriteOnceEntry =>
+                if (writeCount < max) {
+                  writeCount += 1
+                  writeEntry(path, entry.inputStream(), append = false)
+                }
             }
         }
       } finally {

And running the following commands to test the success of jar tf and count the number of entries

rm -rf integration/feature/assembly/repo/out && ./mill -i dev.run integration/feature/assembly/repo -i show exe.large.assembly && jar tf /Users/lihaoyi/Github/mill/integration/feature/assembly/repo/out/exe/large/assembly.dest/out.jar

ls -lh /Users/lihaoyi/Github/mill/integration/feature/assembly/repo/out/exe/large/assembly.dest/out.jar

It seems that there must be some change in the handling of zip files when the number of entries reaches 65535.

lihaoyi avatar Aug 10 '23 02:08 lihaoyi

Thanks for this great analysis, @lihaoyi. It would be nice, if we could evaluate the entry count and warn the user, if it is too large. We could continue, or fail or produce a jar without prepend script in case of too many entries. Don't know which option is the best though. All are unpredictable. Or we make the behavior configurable and default to fail, which feels like the best option to me.

lefou avatar Aug 14 '23 22:08 lefou

I opened PR https://github.com/com-lihaoyi/mill/pull/3140, which detects faulty assembly JARs, bases on Haoyi's findings and fails the build with an actionable error message.

lefou avatar May 03 '24 09:05 lefou

Merged as part of #3140.

lefou avatar May 06 '24 09:05 lefou