
[GR-52516] How does GraalVM perform EXE linking on Windows?

Open BullyWiiPlaza opened this issue 2 years ago • 10 comments

I noticed a bug caused by UPX when trying to compress GraalVM native EXEs: https://github.com/upx/upx/issues/670

Currently, the maintainer is wondering how exactly native-image links the native EXE on Windows and what "unusual" things native-image might do that could break tools like UPX. Does GraalVM use LINK.exe from Visual C++, or only the SubstrateVM, for linking? What is fundamentally different between native EXEs compiled with the Visual C++ toolchain and what native-image produces? Is there any way to enable debug logging of linker commands? --verbose and/or --diagnostics-mode don't seem to do it. "Regular" native EXEs (compiled with Visual C++) certainly don't cause issues with UPX compression.

BullyWiiPlaza avatar Oct 12 '23 20:10 BullyWiiPlaza

@BullyWiiPlaza Take a look for yourself. The code that links the image on Windows is defined in class WindowsCCLinkerInvocation (in package com.oracle.svm.hosted.image). Method getCommand() is responsible for constructing the link command. Obviously what gets into the command line depends on what you are building so reading that code will tell you definitively what options might arise.

The problem that is breaking UPX may very well not be down to a command option. The generated image includes various sections that are constructed from the Java code which are not going to look like the image sections generated by any Windows compiler. That doesn't mean they have an invalid format as far as Windows is concerned -- they certainly conform to the PECOFF standard -- but they may not be a format within that standard that UPX expects.

So, the 193 errors you are seeing probably reflect 193 cases where UPX does not expect or understand the binary layout GraalVM Native has chosen. What that implies as far as the program's correctness is concerned is probably nothing (although there may possibly be some layout bugs in there that don't break correct execution yet also fail to conform 100% to PECOFF). What it implies as regards UPX's ability to compress the binary, and what format it would be happy to consume instead, is something that only the implementors of UPX can answer -- by inspecting the binary and seeing why UPX does not like its layout.

n.b. The fact that UPX does not like the chosen format might be a reason for GraalVM Native to adopt a different one. I suspect that UPX is making some (simplifying) assumptions about the generated sections because it expects them to be generated by a Windows compiler. Unfortunately, GraalVM Native is pretty much guaranteed to produce image sections that don't look like that. So, even if GraalVM wanted to play ball with UPX, it may not be possible to do so. Once again, this can only be confirmed by a UPX dev who understands how UPX operates explaining what is causing it to barf on the image layout.
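
One way to compare the section layout of a native-image EXE with that of a Visual C++ EXE is to dump the PE section table of each. The following is only a minimal sketch based on the published PE/COFF header layout; the function name `list_pe_sections` and the dictionary keys are made up for illustration, and it says nothing about which layout property UPX actually objects to.

```python
import struct


def list_pe_sections(data: bytes):
    """Parse the PE section table from the raw bytes of an EXE/DLL.

    Hypothetical helper for illustration; returns one dict per section.
    """
    if data[:2] != b"MZ":
        raise ValueError("missing MZ (DOS) signature")
    # e_lfanew at offset 0x3C holds the file offset of the PE signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # COFF file header (20 bytes): Machine, NumberOfSections, TimeDateStamp,
    # PointerToSymbolTable, NumberOfSymbols, SizeOfOptionalHeader, Characteristics.
    _machine, n_sections, _, _, _, opt_size, _ = struct.unpack_from(
        "<HHIIIHH", data, pe_offset + 4
    )
    # The section table follows the optional header; each entry is 40 bytes.
    table = pe_offset + 4 + 20 + opt_size
    sections = []
    for i in range(n_sections):
        (name, vsize, vaddr, raw_size, raw_ptr,
         _, _, _, _, flags) = struct.unpack_from("<8sIIIIIIHHI", data, table + 40 * i)
        sections.append({
            "name": name.rstrip(b"\0").decode("ascii", "replace"),
            "virtual_size": vsize,
            "virtual_address": vaddr,
            "raw_size": raw_size,
            "raw_offset": raw_ptr,
            "characteristics": flags,
        })
    return sections


# Usage (the file name is a placeholder):
#   with open("app.exe", "rb") as f:
#       for s in list_pe_sections(f.read()):
#           print(s)
```

Running this against both kinds of EXE and diffing the output would at least show where the section names, sizes, and characteristics flags differ.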

adinn avatar Oct 13 '23 07:10 adinn

@adinn Many thanks for that info! UPX dev here

markus-oberhumer avatar Oct 13 '23 09:10 markus-oberhumer

@BullyWiiPlaza as for the linker invocation, you can print the exact command line using -H:+TraceNativeToolUsage
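
A rough sketch of how the build output could be captured with that option enabled, so the linker command line can be searched afterwards. The only thing taken from this thread is the `-H:+TraceNativeToolUsage` option; the helper names, `app.jar`, and the log file name are placeholders, and the format of the traced output is not assumed here, so the whole output is simply saved.

```python
import shutil
import subprocess


def build_trace_command(jar_path, launcher="native-image", extra_args=()):
    """Assemble a native-image command line with tool-usage tracing enabled."""
    return [launcher, "-H:+TraceNativeToolUsage", *extra_args, "-jar", jar_path]


def run_and_capture(jar_path, log_path="native-tool-usage.log"):
    """Run the build and write the combined stdout/stderr to log_path.

    Hypothetical helper: jar_path and log_path are placeholders.
    """
    # shutil.which resolves the launcher via PATH (and PATHEXT on Windows).
    launcher = shutil.which("native-image")
    if launcher is None:
        raise FileNotFoundError("native-image not found on PATH")
    result = subprocess.run(
        build_trace_command(jar_path, launcher=launcher),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    with open(log_path, "w", encoding="utf-8") as log:
        log.write(result.stdout)
    return result.returncode


# Usage (placeholder jar name):
#   run_and_capture("app.jar")
```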

petermz avatar Oct 14 '23 02:10 petermz

There was some detailed technical discussion about this UPX problem in an old issue, #4340, especially about WindowsImageHeapProvider.class (which then got moved to WindowsFeature.class). A possible reason for the problem was mentioned in https://github.com/upx/upx/issues/559#issuecomment-1032047038, which came from a comment on the original PR https://github.com/oracle/graal/pull/4051#issuecomment-1036002461.

But I don't know whether those discussions are relevant anymore for the latest GraalVM version.

chirontt avatar Oct 16 '23 23:10 chirontt

I did some debugging with x64dbg under Wine, but found nothing unusual - the UPX-compressed program simply exits with exit code 193.

INT3 breakpoint at <ntdll.RtlExitUserProcess> (00006FFFFFF839C0)!
Process stopped with exit code 0xC1 (193)

Does this exit code ring a bell for somebody?

markus-oberhumer avatar Oct 25 '23 08:10 markus-oberhumer

Update: found suspicious ERROR_BAD_EXE_FORMAT constant in Git source code

$  rg -w -i 0xC1

substratevm/src/com.oracle.svm.core.windows/src/com/oracle/svm/core/windows/WindowsImageHeapProvider.java
180:    private static final int ERROR_BAD_EXE_FORMAT = 0xC1;
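
For reference, the match is purely numeric: 0xC1 is 193 in decimal, which is also the Win32 system error code ERROR_BAD_EXE_FORMAT. A tiny sketch of that check follows; the function name is made up, and whether the exit really originates from this constant in WindowsImageHeapProvider is not confirmed here.

```python
# Win32 system error code 193, the same value as the constant found above.
ERROR_BAD_EXE_FORMAT = 0xC1


def describe_exit_code(code: int) -> str:
    """Report whether a process exit code equals ERROR_BAD_EXE_FORMAT."""
    hex_form = f"0x{code:X}"
    if code == ERROR_BAD_EXE_FORMAT:
        return f"{code} ({hex_form}) matches ERROR_BAD_EXE_FORMAT"
    return f"{code} ({hex_form}) does not match ERROR_BAD_EXE_FORMAT"


print(describe_exit_code(193))
```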

markus-oberhumer avatar Oct 25 '23 16:10 markus-oberhumer

Hello @BullyWiiPlaza, can you please confirm if the original issue (https://github.com/upx/upx/issues/559) is still happening after building a native image using the latest version of GraalVM?

fernando-valdez avatar Jan 12 '24 04:01 fernando-valdez

> Hello @BullyWiiPlaza, can you please confirm if the original issue (upx/upx#559) is still happening after building a native image using the latest version of GraalVM?

Hello, unfortunately, after updating all software, the issue still happens.

BullyWiiPlaza avatar Jan 28 '24 19:01 BullyWiiPlaza

> Hello @BullyWiiPlaza, can you please confirm if the original issue (upx/upx#559) is still happening after building a native image using the latest version of GraalVM?

@fernando-valdez I can confirm that this issue (previously reported as #4340) still exists with the latest GraalVM for JDK 22 Community 22.0.0. The last version that worked out of the box was GraalVM Community Edition 21.3.3.1 (Java 17) from September 2022.

See here for a real world reproducer: https://github.com/vegardit/copycat/actions/runs/8426676147/job/23075534217#check-step-13

sebthom avatar Mar 25 '24 20:03 sebthom

I'm still hit by this in GraalVM 23.

nmwael avatar Oct 09 '24 05:10 nmwael

One year bump for status from @fernando-valdez or @graalvmbot team.

nmwael avatar Jan 13 '25 10:01 nmwael

We are not actively working on that issue at the moment. I don't expect we would have resources to investigate this deeper anytime soon.

If there is some concrete bug for us to fix, or an outside contribution that does that, we might look into it.

wirthi avatar Jan 17 '25 09:01 wirthi

> We are not actively working on that issue at the moment. I don't expect we would have resources to investigate this deeper anytime soon.
>
> If there is some concrete bug for us to fix, or an outside contribution that does that, we might look into it.

Aww, I would have loved to be able to show a size-comparable Rust vs Java Native example. Right now it's 10 MB vs 3 MB :( If it actually worked, it would be 3 MB vs 4 MB.

nmwael avatar Jan 17 '25 18:01 nmwael