JavaModule.assembly produces invalid or corrupt jar file (was: publishLocal as alternative to jars: which are corrupt often and at random)
I am getting a lot of corrupt jars, even though I have cleared/checked everything in sight several times. This happens to fat jars in sbt too, but there we have the alternative (for use in Jupyter, for example) of publishLocal. Can mill provide such an option?
Mill already supports .publishLocal
@siddhartha-gadgil So, what exactly is your issue? What is corrupt? And how did this happen?
Firstly, I have set up publishLocal and this is working fine for the case I needed now (Jupyter Notebooks with Almond). On my work desktop, but not my home laptop, the output of assembly was corrupted for some modules. This meant that executable binaries would crash with "mainClass ... not found", and if I loaded the jar in ammonite using import $cp.myjar, subsequent commands would not find the contents. I found online that I should try jar tf, and this indeed confirmed the corruption.
Presumably some source/doc jar of a dependency is corrupted, causing the upstream corruption. In fact it would be best for me to have a thinFat jar for running, excluding the sources and docs (especially of dependencies). Is that already an option?
On Sun, Jan 20, 2019 at 12:26 AM Tobias Roeser [email protected] wrote:
@siddhartha-gadgil https://github.com/siddhartha-gadgil So, what exactly is your issue? What is corrupt? And how did this happen?
— You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub https://github.com/lihaoyi/mill/issues/528#issuecomment-455805936, or mute the thread https://github.com/notifications/unsubscribe-auth/ADatpLqa_Pf9PZdbnjLCZOqsjaRbKg4hks5vE2ptgaJpZM4aHMcz .
To be honest, I have no clue what you wanna tell us. It would be helpful to be more concrete, e.g. the exact mill cmdline you used when mill crashed, and the exact error message. If possible, the build.sc...
It looks like you would be better off asking on https://gitter.im/lihaoyi/mill.
I did not mean that mill crashes, but that the executable jar generated by mill myproject.assembly crashes, and in a random and hard to diagnose (at least for me) way. The reason is almost certainly some cached corrupt jar, and probably a source/documentation jar.
- My main request was to publishLocal, as an alternative, which @lihaoyi pointed out is already part of mill.
- A feature that would be nice is a minimal fat-jar, i.e. excluding source and documentation dependencies, so reducing the chance of corrupt jars fouling this up.
- Even better would be better diagnostics while building jars, though this may not be a mill issue.
@siddhartha-gadgil do you happen to be using Mill concurrently? Mill doesn't have proper concurrency control, so while concurrent use usually works, if tasks happen to overlap in what they are doing, things blow up in odd ways.
If you want diagnostics while building jars, feel free to build the jars yourself by copy-pasting the Mill code into your build.sc and adding whatever diagnostics you'd like.
Quite possibly that is happening. But the main problem is probably a giant dependency on the Stanford Parser (coupled with not great internet) - random bits get corrupted in the cache, and any piece corrupted seems to make the jar unusable.
I'll try manually building by copy-pasting, with checks thrown in for corrupt jars.
@siddhartha-gadgil for internet-related issues with upstream dependencies, those issues should be persistent until you clean your coursier cache in ~/.coursier. If you are seeing things behave nondeterministically without clearing that cache, it is unlikely to be internet related.
Also, if you think things are being corrupted while building jars, use mill inspect and mill show to trawl the dependency graph of the task that created the corrupted jar, and look at the input jars/folders to see if they contain what you expect. This might help you narrow down the corrupted jar to the actual culprit doing the corruption.
It is not non-deterministic - I have a natural experiment because of two work systems, and I meant that the jar builds correctly on one and not on the other. I also meant that the behaviour changes when some dependencies change. I have tried trawling and resetting, but doing this manually is not easy, at least on a large scale.
For example, today I used publishLocal, got a crash in ammonite, deleted the corresponding coursier file and re-published to fix the error. But without the pointer from running (or some other script-based way) it would not be practical to find which dependency is corrupt.
This may actually be a mill issue, but I will get more data and report. I generated a list in mill of the upstreamAssemblyClasspath, and checked that "jar tf _" loaded successfully in each of them. However the same command on the output gives "java.util.zip.ZipException: invalid END header (bad central directory size)"
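The per-jar check described above can also be scripted. The following is a minimal sketch, assuming the classpath entries have already been collected as file paths (for example from the output of upstreamAssemblyClasspath). The JarCheck object and its method names are hypothetical and not part of Mill. Directories on the classpath would also be reported as unreadable, so filter them out first.

```scala
import java.io.File
import java.util.zip.ZipFile
import scala.util.{Failure, Success, Try}

// Hypothetical helper, not part of Mill.
object JarCheck {
  /** Opens the jar (which reads its central directory) and returns the number of entries. */
  def entryCount(jar: File): Try[Int] = Try {
    val zip = new ZipFile(jar)
    try zip.size() finally zip.close()
  }

  /** Returns every file that could not be opened as a zip, with the error message. */
  def unreadable(jars: Seq[File]): Seq[(File, String)] =
    jars.flatMap { jar =>
      entryCount(jar) match {
        case Success(_) => None
        case Failure(e) => Some(jar -> String.valueOf(e.getMessage))
      }
    }

  def main(args: Array[String]): Unit = {
    val bad = unreadable(args.toSeq.map(new File(_)))
    bad.foreach { case (jar, msg) => println(s"CORRUPT: $jar ($msg)") }
    println(s"${args.length - bad.size} of ${args.length} jars readable")
  }
}
```

Passing the assembly output itself as one of the arguments should surface the same kind of ZipException as quoted above if the JDK cannot read it.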
Sorry for the probably wrong report. From searching, it looks like the sheer size may be the issue (over 65k files) because of the large dependency. I confirmed that unzip worked fine on the jar.
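The 65k figure lines up with the classic ZIP format, whose end-of-central-directory record stores the entry count in a 16-bit field; archives beyond that need the Zip64 extensions. The following is a rough sketch for estimating whether a merged assembly would cross that threshold, assuming the input jars are available as files. The AssemblySize object and its names are hypothetical, and the count is only an estimate because an assembly task may drop or merge some entries.

```scala
import java.io.File
import java.util.zip.ZipFile

// Hypothetical helper, not part of Mill.
object AssemblySize {
  /** 0xFFFF is the largest value of the 16-bit entry-count field; it also serves as the Zip64 marker. */
  val Zip64MarkerCount = 0xFFFF

  /** Names of all entries in one jar. */
  def entryNames(jar: File): Set[String] = {
    val zip = new ZipFile(jar)
    try {
      val names = Set.newBuilder[String]
      val entries = zip.entries()
      while (entries.hasMoreElements) names += entries.nextElement().getName
      names.result()
    } finally zip.close()
  }

  /** Distinct entry names across all jars: an estimate of the merged assembly's entry count. */
  def mergedEntryCount(jars: Seq[File]): Int =
    jars.foldLeft(Set.empty[String])((acc, jar) => acc ++ entryNames(jar)).size

  /** True when the count no longer fits the classic 16-bit field. */
  def likelyNeedsZip64(entryCount: Int): Boolean = entryCount >= Zip64MarkerCount
}
```

If likelyNeedsZip64(mergedEntryCount(jars)) is true, the assembly is in the size range discussed in this thread.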
FYI, it looks like the assembly target has an issue with very large assemblies and a non-empty prependShellScript. Here is a workaround:
override def prependShellScript: T[String] = ""
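In context, the override might sit in a build.sc like the sketch below. The module name app is borrowed from the out/app/assembly.dest path that appears elsewhere in this thread, and the scalaVersion value is only a placeholder; neither comes from the original comment.

```scala
// build.sc (sketch; module name and Scala version are assumptions)
import mill._, scalalib._

object app extends ScalaModule {
  def scalaVersion = "2.13.12" // placeholder, use your project's version

  // Workaround from this thread: do not prepend a launcher script to the assembly jar.
  override def prependShellScript: T[String] = ""
}
```

With an empty prepend script the jar is generally no longer directly executable as ./out.jar, so it has to be started with java -jar out/app/assembly.dest/out.jar.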
@lefou yes, I have seen that misbehaviour before, when I was building some 500 MB assemblies. I worked around it by disabling the prepend shell script.
On Thu, 28 Jan 2021 at 6:44 PM, Tobias Roeser [email protected] wrote:
Reopened #528 https://github.com/lihaoyi/mill/issues/528.
So, I think we should create a test case that reproduces this issue and then fix it. After a first glance at the code, I think it could be that a proper fix will need to go into os-lib.
I recently fixed some issues with file handles left open in assembly processing (which constantly failed the Windows tests, #1327). There is a small chance this issue just vanishes after that fix, too. It would be nice if someone could report whether this issue is still present with mill >= 0.9.7-9-848292.
This is probably fixed?
Please reopen or comment if you find this issue is still valid!
This issue is still present. See #2650 for a reproduction.
Why do I keep getting this error?
./out/app/assembly.dest/out.jar
Error: Could not find or load main class MyApp
Caused by: java.lang.ClassNotFoundException: MyApp
Your assembly is probably too large and hits an issue with the JVM/JDK. The latest Mill snapshots and the upcoming version 0.11.8 will detect this and recommend a fix.
In the meantime, just use the workaround from https://github.com/com-lihaoyi/mill/issues/528#issuecomment-768965337.