
Remove dependency on link.exe and Windows SDK

Open MichalStrehovsky opened this issue 5 years ago • 31 comments

Link.exe is not available as a standalone tool; it is bundled with the Windows SDK/DDK, which is a huge download.

Investigate whether we can bundle LLD.

  • [ ] make sure it can embed NatVis files when targeting Windows
  • [ ] make sure it can generate all the debug records we emit
  • [ ] make sure it supports SourceLink (I have a WIP SourceLink support in a branch that I can never get to - https://github.com/MichalStrehovsky/corert/commit/84e53e3eec9ff24e666eb55792de95eb4b4e6585)
  • [ ] make sure it can generate import libraries out of DEF files a la link.exe /lib /def:foo.def /machine:x64 /out:foo.lib (this is our way out of the "what kind of import libraries to specify" hell + gets rid of the Windows SDK dependency)

This is also a stepping stone to enable cross-compilation (e.g. target Windows from Linux and Linux from Windows).

MichalStrehovsky avatar Jan 05 '20 08:01 MichalStrehovsky

@MichalStrehovsky I've been wanting to do this for a while, and I think I finally figured it out. Look at this:

#!/bin/sh
curl -O https://download.visualstudio.microsoft.com/download/pr\
/3f2fc602-8afd-4687-a62a-80be4d6767f6\
/c9a22f5e1884344e9a2c7874e9feb473fd2faecb1e1fe71e8964c8d4c62658e5\
/Microsoft.VisualCpp.Tools.HostX64.TargetX64.vsix
unzip -d Tools Microsoft.VisualCpp.Tools.HostX64.TargetX64.vsix
fd 'exe$'

Result:

Tools\Contents\VC\Tools\MSVC\14.24.28314\bin\Hostx64\x64\bscmake.exe
Tools\Contents\VC\Tools\MSVC\14.24.28314\bin\Hostx64\x64\cl.exe
Tools\Contents\VC\Tools\MSVC\14.24.28314\bin\Hostx64\x64\cvtres.exe
Tools\Contents\VC\Tools\MSVC\14.24.28314\bin\Hostx64\x64\dumpbin.exe
Tools\Contents\VC\Tools\MSVC\14.24.28314\bin\Hostx64\x64\editbin.exe
Tools\Contents\VC\Tools\MSVC\14.24.28314\bin\Hostx64\x64\lib.exe
Tools\Contents\VC\Tools\MSVC\14.24.28314\bin\Hostx64\x64\link.exe
Tools\Contents\VC\Tools\MSVC\14.24.28314\bin\Hostx64\x64\ml64.exe
Tools\Contents\VC\Tools\MSVC\14.24.28314\bin\Hostx64\x64\mspdbcmf.exe
Tools\Contents\VC\Tools\MSVC\14.24.28314\bin\Hostx64\x64\mspdbsrv.exe
Tools\Contents\VC\Tools\MSVC\14.24.28314\bin\Hostx64\x64\nmake.exe
Tools\Contents\VC\Tools\MSVC\14.24.28314\bin\Hostx64\x64\undname.exe
Tools\Contents\VC\Tools\MSVC\14.24.28314\bin\Hostx64\x64\vcperf.exe
Tools\Contents\VC\Tools\MSVC\14.24.28314\bin\Hostx64\x64\vctip.exe
Tools\Contents\VC\Tools\MSVC\14.24.28314\bin\Hostx64\x64\xdcmake.exe

ghost avatar Jan 05 '20 13:01 ghost

We also need .lib files from Windows SDK to link.

jkotas avatar Jan 05 '20 14:01 jkotas

We also need .lib files from Windows SDK to link.

I've been busy with other things, but I wanted to mention that I think I have an answer for this as well. I believe the package in question would be Win10SDK_10.0.18362 or similar, and the installer in question would be Windows SDK for Windows Store Apps Libs-x86_en-us.msi or similar. Here is some output:

Windows Kits\10\Lib\10.0.18362.0\um\x64\icuin.Lib
Windows Kits\10\Lib\10.0.18362.0\um\x64\icuuc.lib
Windows Kits\10\Lib\10.0.18362.0\um\x64\inkobjcore.lib
Windows Kits\10\Lib\10.0.18362.0\um\x64\iphlpapi.lib
Windows Kits\10\Lib\10.0.18362.0\um\x64\kernel32.Lib
Windows Kits\10\Lib\10.0.18362.0\um\x64\kernel32legacylib.lib
Windows Kits\10\Lib\10.0.18362.0\um\x64\mbnapi.tlb
Windows Kits\10\Lib\10.0.18362.0\um\x64\mbnapi_uuid.lib
Windows Kits\10\Lib\10.0.18362.0\um\x64\mfreadwrite.lib
Windows Kits\10\Lib\10.0.18362.0\um\x64\mfsensorgroup.lib

https://github.com/cup/sunday/tree/master/visual-studio

ghost avatar Jan 12 '20 00:01 ghost

One problem with grabbing these from various VSIX and MSI files is that there's no good way to automatically pull those files in. We either need some external prerequisite script that does this download (similar to installing VS, except maybe we need a bit less stuff), or the Microsoft.DotNet.ILCompiler package can do it automatically but then it gets into the business of managing external state outside NuGet (which is not a good place to be in).

I leaned towards trying things with LLD because that might be easier to just pack into a NuGet package. We would still be in the business of servicing LLD but maybe it's less work than link.exe.

MichalStrehovsky avatar Jan 13 '20 09:01 MichalStrehovsky

Did you look at https://github.com/mstorsjo/llvm-mingw ? It's all self-contained and multiplatform too.

woachk avatar Jan 25 '20 10:01 woachk

I’d like to try addressing this issue. Can you please give me some pointers on how to set up my machine for developing/testing Ilc? For example, how do I set up test projects so that they use the Ilc I am building and not the one from NuGet?

teobugslayer avatar Apr 24 '20 13:04 teobugslayer

Thanks!

I think before starting anything, we need to have clarity on the lifecycle of the external bits - what's the source of the linker bits, how is the lifetime of it going to be managed, etc.

The existing situation, while not ideal, ensures that the linker and SDK are serviced (if there's e.g. a security issue, this is handled by the VS installer), and that they can be properly uninstalled. If we were to e.g. download the linker from a URL on the first launch, we would need to store the bits somewhere and suddenly we're in the business of managing their lifetime. Plus, what if the first launch happens without network access, or there's a network error? Normally the network is no longer needed after "nuget restore".

I don't have a good proposal/solution for this, unfortunately.

MichalStrehovsky avatar Apr 24 '20 14:04 MichalStrehovsky

Good points.

First, installation of LLD must happen when Ilc is being installed. This makes sense, because this is the time when the user expects changes on their machine and we can reasonably expect a working internet connection, because Ilc itself must be downloaded.

With my limited knowledge, I could think of the following solutions:

  1. Distribute it as part of the Ilc NuGet package. LLD binaries are committed to the CoreRT repo. Pros: lifetime and distribution are managed for us. Cons: Size - we add ~70MB to each and every ILC build. LLD is Apache 2 licensed so we are allowed to redistribute it; however, I am not sure if there are legal obstacles on Microsoft's side to including external bits in the sources and the build.

  2. Package LLD as a NuGet package and depend on it. Pros: The size price is paid only when LLD releases a new version, which is a relatively rare event (once per month or even rarer). Lifetime and distribution are again managed for us by NuGet. Cons: I have no idea how Ilc would find where LLD is on the disk to invoke it. Also, where would the NuGet package be hosted? I do not know NuGet well enough to know if I can submit a non-dotnet package.

  3. Keep the LLD bits somewhere and download them manually. Pros: I do not see any. Cons: Essentially we need to write a distribution system, which is clearly outside the scope of writing a compiler. I do not know where to keep the binaries (the size means traffic, and traffic is $$$).

I am most in favor of option 2.

teobugslayer avatar Apr 24 '20 17:04 teobugslayer

It might be worth investigating if having an LLD build without LTO support (which will not be needed and will be nonfunctional for CoreRT anyway) is feasible. That'd more than significantly decrease the LLD size.

woachk avatar Apr 24 '20 17:04 woachk

By the way, I did a very dirty hack - I renamed lld-link.exe to link.exe, then renamed MS's link.exe to ms-link, checked with where link that indeed only my bad boy is on the path, ran dotnet publish and voila - got my Avalonia demo compiled and fully functional.

Edit: I tried debugging the produced exe in Visual Studio, it worked. Also, according to this commit, LLD supports natvis.

teobugslayer avatar Apr 24 '20 19:04 teobugslayer

LLD is indeed production-ready for x86_32/x86_64/arm/arm64 on Windows at this point, and is pretty much a seamless replacement.

Kinda wonder if a CoreRT specific linker written in C# to avoid the redistribution issues would be worth it tho.

woachk avatar Apr 24 '20 19:04 woachk

I think the starting point for this should be an independent NuGet package with the LLD linker, and allowing CoreRT to use the linker from that package. One would need to specify the LLD linker NuGet package as an additional reference; the CoreRT package would not depend on it directly.

This setup would allow people who are on Visual C++ and happy with it to keep using that without paying for LLD, and people who would rather use LLD to use that instead.

jkotas avatar Apr 24 '20 22:04 jkotas

@jkotas, I agree this is a practical approach. I did non-scientific measurements, and these are the performance results of the two linkers:

ms-link
  User Time: 8.7656250 seconds
  Kernel Time: 1.9687500 seconds
  Private Bytes: 612 765 696
  Peak Private Bytes: 4 186 517 504
  Working Set: 635 289 600
  Peak Working Set: 4 983 738 368

LLD
  User Time: 6.4375000 seconds
  Kernel Time: 1.7812500 seconds
  Private Bytes: 315 346 944
  Peak Private Bytes: 1 643 630 592
  Working Set: 1 532 616 704
  Peak Working Set: 4 029 952 000
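To put the two measurements side by side, here is a small Python sketch that computes LLD's figures as a fraction of ms-link's. The numbers are copied from the comment; the dictionary keys and helper names are made up for illustration, and a single run like this is only indicative.

```python
# Figures copied from the measurements in this comment
# (times in seconds, sizes in bytes). Key names are illustrative.
MS_LINK = {
    "user": 8.765625,
    "kernel": 1.96875,
    "peak_private": 4_186_517_504,
    "peak_working_set": 4_983_738_368,
}
LLD = {
    "user": 6.4375,
    "kernel": 1.78125,
    "peak_private": 1_643_630_592,
    "peak_working_set": 4_029_952_000,
}


def cpu_seconds(sample):
    """Total CPU time: user time plus kernel time."""
    return sample["user"] + sample["kernel"]


# Each ratio is LLD relative to ms-link; below 1.0 means LLD used less.
cpu_ratio = cpu_seconds(LLD) / cpu_seconds(MS_LINK)
peak_private_ratio = LLD["peak_private"] / MS_LINK["peak_private"]
peak_working_set_ratio = LLD["peak_working_set"] / MS_LINK["peak_working_set"]

print(f"CPU time:          {cpu_ratio:.2f}x")
print(f"Peak private:      {peak_private_ratio:.2f}x")
print(f"Peak working set:  {peak_working_set_ratio:.2f}x")
```

On these figures LLD used roughly three quarters of ms-link's CPU time and about 39% of its peak private bytes.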

Although raw numbers do not capture the feeling, LLD feels so much faster and snappier that I do not want to use ms-link at all anymore. So for me, using a separate NuGet dependency would be just a step towards making LLD the default.

That said, let's go back to writing code. I created a simple NuGet package which exports two msbuild properties: IlcLinker and CppLinker: https://github.com/teobugslayer/corert-lld

Then, I hacked Ilc's Microsoft.NETCore.Native.Windows.props with these modifications:

<CppLinker Condition="'$(CppLinker)' == ''">link</CppLinker>
<!-- later to PropertyGroup Condition="'$(_VCVarsAllFound)' == '0'" -->
<!--<CppLinker>"$(_CppToolsDirectory)link.exe"</CppLinker>-->

This picked up my linker and successfully built my test project with LLD from my NuGet package.

My knowledge about hacking msbuild stops here. I cannot suggest the proper way for Ilc to pick up the externally defined msbuild property. Your suggestions are welcome.

teobugslayer avatar Apr 25 '20 04:04 teobugslayer

I actually found a bug in Ilc while testing my setup :)

objwriter.dll depends on msvcp140.dll and vcruntime140.dll, which are not included. @MichalStrehovsky could you please fix this?

teobugslayer avatar Apr 25 '20 10:04 teobugslayer

@teobugslayer Is requiring the Visual C++ redistributable a bug even? A lot of apps require it on Windows anyway. (that said recompiling that with /MT instead of /MD is possible)

woachk avatar Apr 25 '20 10:04 woachk

@woachk I haven't stated what the fix would be. Changing of system requirements is a valid fix.

However, given the fact that Ilc already brings ucrtbase.dll and ms-api-* friends, I assume that creating a self-contained distribution was a goal for the project. I think this is better.

teobugslayer avatar Apr 25 '20 10:04 teobugslayer

I found another problem - Ilc requires libcmt.lib, libcpmt.lib, and oldnames.lib, which come only with the VC compiler toolset. I am not sure if we can work-around these.

teobugslayer avatar Apr 25 '20 11:04 teobugslayer

To summarize my thoughts on the issue: what work is needed to make building projects with CoreRT dependent only on the .NET Core SDK and no other pre-installed packages?

  • [ ] dependency on the Windows SDK for import libraries such as kernel32.lib, and the CRT. Can be solved by manually generating LIB files from an existing installation of Windows and distributing these as a separate NuGet package.

  • [ ] dependency on the Microsoft Visual C++ redist. Can be solved by distributing the required msvcp* files in the Ilc NuGet package.

  • [ ] dependency on the static version of the CRT. Cannot be legally solved, as long as we require the MS toolchain.

  • [ ] dependency on the Microsoft build toolchain, esp. the linker. Can be solved by distributing the LLVM linker LLD in a separate NuGet package.

Given that we currently cannot avoid using the static CRT, I think the last point is only marginally important - LLD reduces build times, but this is a nice-to-have feature. However, removing the need for the Windows SDK is very beneficial, especially for people with small SSDs, like me.

Thoughts? Should I pursue this further?

teobugslayer avatar Apr 29 '20 06:04 teobugslayer

However, removing the need for the Windows SDK is very beneficial, especially for people with small SSDs, like me.

It also avoids the problem of forgetting to specify LIB files for APIs that are hard bound. I've been wondering whether it would make sense to add a compiler option to dump all hardbound p/invokes into a file (could be as simple as a flat list of module names and procedure names), and then add an MSBuild task that runs before the linker, goes over this file, and generates DEF files for the individual libraries. Then add some MSBuild to run link /lib /machine:x64 /def:foo.def /out:foo.lib to generate a LIB file out of each.

The format of the DEF files that link.exe accepts for this is:

LIBRARY foo
EXPORTS
Bar
Baz
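As a rough illustration of that step, the Python sketch below renders a DEF file in the format shown above and builds the matching link /lib command line. The function names are hypothetical; only the DEF layout and the link.exe flags come from this comment, and in practice this would be an MSBuild task rather than a script.

```python
def render_def(library, exports):
    """Return DEF file text in the LIBRARY/EXPORTS layout shown above."""
    lines = [f"LIBRARY {library}", "EXPORTS"]
    lines.extend(exports)
    return "\n".join(lines) + "\n"


def lib_command(library, machine="x64"):
    """Build the argument list for: link /lib /machine:... /def:... /out:..."""
    return [
        "link",
        "/lib",
        f"/machine:{machine}",
        f"/def:{library}.def",
        f"/out:{library}.lib",
    ]


if __name__ == "__main__":
    print(render_def("foo", ["Bar", "Baz"]), end="")
    print(" ".join(lib_command("foo")))
```

The command is returned as a list so it could be handed to a process launcher without shell quoting issues.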

We already have some MSBuild tasks that support the compiler so the building blocks are there. @jkotas what do you think?

MichalStrehovsky avatar Apr 29 '20 11:04 MichalStrehovsky

I like this idea.

jkotas avatar Apr 29 '20 14:04 jkotas

As for the format of the file listing all the pinvokes, we can just use the same format that IL Linker produces (mono/linker#992). I don't have a specific reason for why, but we need a format and one was already invented. I wouldn't bother with the full name and assembly field for now, so just:

[
    {
        "entryPoint": "CustomEntryPoint",
        "moduleName": "lib_copyassembly"
    },
    {
        "entryPoint": "FooEntryPoint",
        "moduleName": "lib_copyassembly"
    }
]
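For illustration, a consumer of that file could group the entries by module, which is the shape needed to emit one DEF file per library. This is a Python sketch under the assumption that only the two fields above are present; `group_by_module` and `SAMPLE` are hypothetical names, not part of any existing tool.

```python
import json
from collections import defaultdict

# Sample input in the trimmed-down format shown above.
SAMPLE = """
[
    {"entryPoint": "CustomEntryPoint", "moduleName": "lib_copyassembly"},
    {"entryPoint": "FooEntryPoint", "moduleName": "lib_copyassembly"}
]
"""


def group_by_module(pinvokes_json):
    """Map each moduleName to a sorted list of its unique entryPoint names."""
    modules = defaultdict(set)
    for entry in json.loads(pinvokes_json):
        modules[entry["moduleName"]].add(entry["entryPoint"])
    return {name: sorted(points) for name, points in sorted(modules.items())}


if __name__ == "__main__":
    for module, entry_points in group_by_module(SAMPLE).items():
        print(module, entry_points)
```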

@teobugslayer if you're interested in implementing this, I can try to give you some pointers.

MichalStrehovsky avatar May 01 '20 12:05 MichalStrehovsky

Sure. Nothing you said makes sense to me, so let's see how far we will go.

teobugslayer avatar May 01 '20 13:05 teobugslayer

We are also going to need the right symbols in the .lib to satisfy the dependencies of the unmanaged portion of the runtime, and maybe the CRT too. I am wondering whether it would be better to start by having a checked-in file that has all kernel32, etc. exports from Windows 7. We can use such a list to generate the .lib, but also to do more aggressive hard-binding to PInvokes (i.e. use it for PInvoke configuration).

jkotas avatar May 01 '20 15:05 jkotas

As for the format of the file listing all the pinvokes, we can just use the same format that IL Linker produces

We ultimately need the .def file that Windows linker understands. What's the advantage of generating this intermediate format and have another step that takes the intermediate format to generate what we need vs. just generating what we need directly?

jkotas avatar May 01 '20 15:05 jkotas

We are also going to need the right symbols in the .lib to satisfy the dependencies of the unmanaged portion of the runtime, and maybe CRT too.

Mmmm, yeah, that's a bit annoying. Maybe having the compiler consume DEF files would be a better approach indeed because as you said - it can also be used to implement #2454 (which is trying to figure out a way to control whether a DllImport should directly expect an external symbol to be provided at link time, or whether we should do LoadLibrary/GetProcAddress (dlopen/dlsym) at runtime to resolve the import). The only drawback of using this to implement #2454 is that we would be using a Windows-style file format outside Windows too, but maybe that's fine.
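To make the "consume DEF files" idea concrete, here is a Python sketch of how an export list could drive the choice between a link-time symbol and a runtime lookup. It only handles the minimal LIBRARY/EXPORTS form shown earlier in the thread; the function names and the "direct"/"lazy" labels are invented for illustration and are not how the compiler actually models this.

```python
def parse_def(text):
    """Parse a minimal DEF file into (library name, set of export names)."""
    library = None
    exports = set()
    in_exports = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(";"):  # skip blanks and ';' comments
            continue
        if line.upper().startswith("LIBRARY"):
            parts = line.split(None, 1)
            library = parts[1] if len(parts) > 1 else None
            in_exports = False
        elif line.upper() == "EXPORTS":
            in_exports = True
        elif in_exports:
            exports.add(line.split()[0])
    return library, exports


def binding_kind(known_exports, module, entry_point):
    """Return "direct" when the export is known at build time, else "lazy".

    "direct" stands for an external symbol resolved by the linker;
    "lazy" stands for LoadLibrary/GetProcAddress (dlopen/dlsym) at runtime.
    """
    if entry_point in known_exports.get(module.lower(), set()):
        return "direct"
    return "lazy"


if __name__ == "__main__":
    name, exports = parse_def("LIBRARY foo\nEXPORTS\nBar\nBaz\n")
    known = {name.lower(): exports}
    print(binding_kind(known, "foo", "Bar"))      # known export
    print(binding_kind(known, "foo", "Missing"))  # unknown export
```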

What's the advantage of generating this intermediate format and have another step that takes the intermediate format to generate what we need vs. just generating what we need directly?

It's annoying to have a tool invocation within MSBuild that produces multiple outputs. And it felt like having a task do that would be more convenient. But it's irrelevant if we go with the "consume DEF file" approach.

MichalStrehovsky avatar May 01 '20 17:05 MichalStrehovsky

I continued my tests, and it turns out that we need Windows 10 SDK because it provides the static CRT - libucrt.lib.

I still added a set of manually generated import LIB files into my repo.

I still do not understand what @MichalStrehovsky and @jkotas are talking about, but at least I proved that the initial issue (remove dependency on Windows 10 SDK) is unfeasible and cannot be (legally) solved.

Of course, IANAL, and the terms on https://docs.microsoft.com/en-us/legal/windows-sdk/redist may actually allow us to redist these files.

Funny side note: As part of my experiments, I tried using the Windows 2003 SDK. It misfired spectacularly and the CoreRT build started using the Unix build scripts. After a few hours of looking, it turned out that old Windows SDKs set the TARGETOS environment variable to "WINNT". I leave it as an exercise to the reader what happens with this condition: <TargetOS Condition="'$(TargetOS)' == ''">$(OS)</TargetOS>
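For readers who do not want to do the exercise, here is a toy Python model of that condition. It relies on two MSBuild behaviours: environment variables are visible as properties, and property names are case-insensitive. The function name and the default value for $(OS) are assumptions for illustration; this is not MSBuild code.

```python
def resolve_target_os(environment, os_property="Windows_NT"):
    """Model <TargetOS Condition="'$(TargetOS)' == ''">$(OS)</TargetOS>."""
    # Environment variables seed the property bag; lower-casing the keys
    # models MSBuild's case-insensitive property names.
    properties = {name.lower(): value for name, value in environment.items()}
    properties["os"] = os_property
    # The default is applied only when TargetOS is still empty.
    if properties.get("targetos", "") == "":
        properties["targetos"] = properties["os"]
    return properties["targetos"]


if __name__ == "__main__":
    print(resolve_target_os({}))                     # clean environment
    print(resolve_target_os({"TARGETOS": "WINNT"}))  # old SDK environment
```

With TARGETOS already set by the old SDK, the condition is false and the default from $(OS) is never applied, so any later check that expects the Windows value would not match.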

teobugslayer avatar May 01 '20 17:05 teobugslayer

Here is another problem I could not solve by myself. I tried building ObjWriter (in order to try to add the missing msvc*.dll files) but failed. Following these instructions, the file built, but tests were failing with this assertion:

---------------------------
Microsoft Visual C++ Runtime Library
---------------------------
Assertion failed!

Program: ...orert\bin\Windows_NT.x64.Debug\tools\objwriter.DLL
File: C:\Dev\corert\bin\obj\Native\Window...\Managed...tic.cpp
Line: 67

Expression: DeleterFn && "ManagedStatic not initialized correctly!"

For information on how your program can cause an assertion
failure, see the Visual C++ documentation on asserts

(Press Retry to debug the application - JIT must be enabled)
---------------------------
Abort   Retry   Ignore   
---------------------------

It's not critical for me, because I won't pursue this task more, but wanted to share the information in case you think it's actionable.

teobugslayer avatar May 01 '20 17:05 teobugslayer

@MichalStrehovsky @jkotas Can we not just use these? I mean they're preview, but they are made by Microsoft.

  • https://www.nuget.org/packages/Microsoft.Windows.SDK.CPP.x64
  • https://www.nuget.org/packages/Microsoft.Windows.SDK.CPP.arm64
  • https://www.nuget.org/packages/Microsoft.Windows.SDK.CPP.arm
  • https://www.nuget.org/packages/Microsoft.Windows.SDK.CPP.x86

mjsabby avatar Oct 14 '20 06:10 mjsabby

I think SourceLink is missing from lld, but the other checkboxes seem to be checked.

mjsabby avatar Oct 14 '20 06:10 mjsabby

@mjsabby Do you know what those packages are used for? I can't find mentions of them on the internet. I want to make sure that we would not take a dependency on some experimental packaging of a closed source project that can disappear. The program manager on the C++ team whom I would ask about this seems to have left the company.

MichalStrehovsky avatar Oct 14 '20 07:10 MichalStrehovsky