corert
Remove dependency on link.exe and Windows SDK
Link.exe is not available as a standalone tool; it is bundled with the Windows SDK/DDK, which is a huge download.
Investigate whether we can bundle LLD.
- [ ] make sure it can embed NatVis files when targeting Windows
- [ ] make sure it can generate all the debug records we emit
- [ ] make sure it supports SourceLink (I have WIP SourceLink support in a branch that I never manage to get back to - https://github.com/MichalStrehovsky/corert/commit/84e53e3eec9ff24e666eb55792de95eb4b4e6585)
- [ ] make sure it can generate import libraries out of DEF files a la link.exe /lib /def:foo.def /machine:x64 /out:foo.lib (this is our way out of the "what kind of import libraries to specify" hell + gets rid of the Windows SDK dependency)
This is also a stepping stone to enable cross-compilation (e.g. target Windows from Linux and Linux from Windows).
@MichalStrehovsky I've been wanting to do this for a while, but I think I finally figured it out. Look at this:
#!/bin/sh
curl -O https://download.visualstudio.microsoft.com/download/pr\
/3f2fc602-8afd-4687-a62a-80be4d6767f6\
/c9a22f5e1884344e9a2c7874e9feb473fd2faecb1e1fe71e8964c8d4c62658e5\
/Microsoft.VisualCpp.Tools.HostX64.TargetX64.vsix
unzip -d Tools Microsoft.VisualCpp.Tools.HostX64.TargetX64.vsix
fd 'exe$'
Result:
Tools\Contents\VC\Tools\MSVC\14.24.28314\bin\Hostx64\x64\bscmake.exe
Tools\Contents\VC\Tools\MSVC\14.24.28314\bin\Hostx64\x64\cl.exe
Tools\Contents\VC\Tools\MSVC\14.24.28314\bin\Hostx64\x64\cvtres.exe
Tools\Contents\VC\Tools\MSVC\14.24.28314\bin\Hostx64\x64\dumpbin.exe
Tools\Contents\VC\Tools\MSVC\14.24.28314\bin\Hostx64\x64\editbin.exe
Tools\Contents\VC\Tools\MSVC\14.24.28314\bin\Hostx64\x64\lib.exe
Tools\Contents\VC\Tools\MSVC\14.24.28314\bin\Hostx64\x64\link.exe
Tools\Contents\VC\Tools\MSVC\14.24.28314\bin\Hostx64\x64\ml64.exe
Tools\Contents\VC\Tools\MSVC\14.24.28314\bin\Hostx64\x64\mspdbcmf.exe
Tools\Contents\VC\Tools\MSVC\14.24.28314\bin\Hostx64\x64\mspdbsrv.exe
Tools\Contents\VC\Tools\MSVC\14.24.28314\bin\Hostx64\x64\nmake.exe
Tools\Contents\VC\Tools\MSVC\14.24.28314\bin\Hostx64\x64\undname.exe
Tools\Contents\VC\Tools\MSVC\14.24.28314\bin\Hostx64\x64\vcperf.exe
Tools\Contents\VC\Tools\MSVC\14.24.28314\bin\Hostx64\x64\vctip.exe
Tools\Contents\VC\Tools\MSVC\14.24.28314\bin\Hostx64\x64\xdcmake.exe
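The same listing can be produced without extracting anything, since a VSIX is a zip archive. The following is only a minimal sketch of that idea; the helper name list_executables and the tiny in-memory archive standing in for the real download are assumptions made for illustration, not part of the script above.

```python
import io
import zipfile


def list_executables(vsix_file):
    """Return the sorted names of all .exe entries in a VSIX (zip) archive."""
    with zipfile.ZipFile(vsix_file) as archive:
        return sorted(
            name for name in archive.namelist()
            if name.lower().endswith(".exe")
        )


if __name__ == "__main__":
    # Hypothetical stand-in for the downloaded VSIX: a small in-memory zip.
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("Contents/VC/Tools/bin/link.exe", b"")
        archive.writestr("Contents/VC/Tools/bin/lib.exe", b"")
        archive.writestr("manifest.json", b"{}")
    buffer.seek(0)
    for name in list_executables(buffer):
        print(name)
```

Passing the path of the downloaded .vsix file instead of the in-memory buffer should work the same way, because zipfile.ZipFile accepts either a path or a file-like object.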
We also need .lib files from Windows SDK to link.
We also need .lib files from Windows SDK to link.
I've been busy with other things, but I wanted to mention that I think I have an
answer for this as well. I believe the package in question would be
Win10SDK_10.0.18362 or similar and the installer in question would be
Windows SDK for Windows Store Apps Libs-x86_en-us.msi or similar. Here is some
output:
Windows Kits\10\Lib\10.0.18362.0\um\x64\icuin.Lib
Windows Kits\10\Lib\10.0.18362.0\um\x64\icuuc.lib
Windows Kits\10\Lib\10.0.18362.0\um\x64\inkobjcore.lib
Windows Kits\10\Lib\10.0.18362.0\um\x64\iphlpapi.lib
Windows Kits\10\Lib\10.0.18362.0\um\x64\kernel32.Lib
Windows Kits\10\Lib\10.0.18362.0\um\x64\kernel32legacylib.lib
Windows Kits\10\Lib\10.0.18362.0\um\x64\mbnapi.tlb
Windows Kits\10\Lib\10.0.18362.0\um\x64\mbnapi_uuid.lib
Windows Kits\10\Lib\10.0.18362.0\um\x64\mfreadwrite.lib
Windows Kits\10\Lib\10.0.18362.0\um\x64\mfsensorgroup.lib
https://github.com/cup/sunday/tree/master/visual-studio
One problem with grabbing these from various VSIX and MSI files is that there's no good way to automatically pull those files in. We either need some external prerequisite script that does this download (similar to installing VS, except maybe we need a bit less stuff), or the Microsoft.DotNet.ILCompiler package can do it automatically but then it gets into the business of managing external state outside NuGet (which is not a good place to be in).
I leaned towards trying things with LLD because that might be easier to just pack into a NuGet package. We would still be in the business of servicing LLD but maybe it's less work than link.exe.
Did you look at https://github.com/mstorsjo/llvm-mingw ? It's all self-contained and multiplatform too.
I’d like to try addressing this issue. Can you please give me some pointers on how to set up my machine for developing/testing Ilc? For example, how do I set up test projects so that I can use the Ilc I am building and not the one from NuGet?
Thanks!
I think before starting anything, we need to have clarity on the lifecycle of the external bits - what's the source of the linker bits, how is the lifetime of it going to be managed, etc.
The existing situation, while not ideal, ensures that the linker and SDK are serviced (if there's e.g. a security issue, this is handled by the VS installer), and that they can be properly uninstalled. If we were to e.g. download the linker from a URL on the first launch, we would need to store the bits somewhere and suddenly we're in the business of managing their lifetime. Plus, what if the first launch happens without network, or there's a network error? Normally the network is no longer needed after "nuget restore".
I don't have a good proposal/solution for this, unfortunately.
Good points.
First, installation of LLD must happen when Ilc is being installed. This makes sense, because this is the time when the user expects changes on their machine and we can reasonably expect to have a working internet connection, because Ilc itself must be downloaded.
With my limited knowledge, I could think of the following solutions:
- Distribute it as part of the Ilc NuGet package, with the LLD binaries committed to the CoreRT repo. Pros: lifetime and distribution are managed for us. Cons: size; we add ~70MB to each and every ILC build. LLD is Apache 2 licensed, so we are allowed to redistribute it; however, I am not sure whether there are legal obstacles on the Microsoft side to including external bits in the sources and the build.
- Package LLD as a NuGet package and depend on it. Pros: the size price is paid only when LLD releases a new version, which is a relatively rare event (once per month or even rarer). Lifetime and distribution are again managed for us by NuGet. Cons: I have no idea how Ilc would find where LLD is on disk in order to invoke it. Also, where would the NuGet package be hosted? I do not know NuGet well enough to know whether I can submit a non-dotnet package.
- Keep the LLD bits somewhere and download them manually. Pros: I do not see any. Cons: essentially we need to write a distribution system, which is clearly outside the scope of writing a compiler. I do not know where to keep the binaries (the size means traffic, and traffic is $$$).
I am most in favor of option 2.
It might be worth investigating whether an LLD build without LTO support (which will not be needed and will be nonfunctional for CoreRT anyway) is feasible. That would significantly decrease the LLD size.
By the way, I did a very dirty hack - I renamed lld-link.exe to link.exe, then renamed MS's link.exe to ms-link, checked with where link that indeed only my bad boy is on the path, ran dotnet publish and voila - got my Avalonia demo compiled and fully functional.
Edit: I tried debugging the produced exe in Visual Studio, it worked. Also, according to this commit, LLD supports natvis.
LLD is indeed production-ready for x86_32/x86_64/arm/arm64 on Windows at this point, and is pretty much a seamless replacement.
Kinda wonder if a CoreRT specific linker written in C# to avoid the redistribution issues would be worth it tho.
I think the starting point for this should be an independent NuGet package with the LLD linker, and CoreRT should be allowed to use the linker from that package. One would need to specify the LLD linker NuGet package as an additional reference; the CoreRT package would not depend on it directly.
This setup would allow people who are on Visual C++ and happy with it to keep using that without paying for LLD, and people who would rather use LLD to use that instead.
@jkotas, I agree this is a practical approach. I did non-scientific measurements, and these are the performance results of the two linkers:
ms-link
User Time: 8.7656250 seconds
Kernel Time: 1.9687500 seconds
Private Bytes: 612 765 696
Peak Private Bytes: 4 186 517 504
Working Set: 635 289 600
Peak Working Set: 4 983 738 368

LLD
User Time: 6.4375000 seconds
Kernel Time: 1.7812500 seconds
Private Bytes: 315 346 944
Peak Private Bytes: 1 643 630 592
Working Set: 1 532 616 704
Peak Working Set: 4 029 952 000
Although the raw numbers do not convey the feeling, lld feels so much faster and snappier that I do not want to use ms-link anymore at all. So for me, the idea of using a separate NuGet dependency would be just a step towards making lld the default one.
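To put the figures above side by side, here is a small sketch that computes total CPU time (user plus kernel) and a few ratios. The numbers are copied from the measurements above; the dictionary layout and the helper names are assumptions made for illustration, and a single non-scientific run should not be read as a benchmark.

```python
# Figures copied from the two measurements above (one run each).
MEASUREMENTS = {
    "ms-link": {
        "user_s": 8.765625,
        "kernel_s": 1.96875,
        "peak_private_bytes": 4_186_517_504,
        "working_set_bytes": 635_289_600,
    },
    "lld": {
        "user_s": 6.4375,
        "kernel_s": 1.78125,
        "peak_private_bytes": 1_643_630_592,
        "working_set_bytes": 1_532_616_704,
    },
}


def cpu_seconds(sample):
    """Total CPU time of one linker run: user time plus kernel time."""
    return sample["user_s"] + sample["kernel_s"]


def ratio(metric, baseline="ms-link", candidate="lld"):
    """How many times larger the baseline's metric is than the candidate's."""
    return MEASUREMENTS[baseline][metric] / MEASUREMENTS[candidate][metric]


if __name__ == "__main__":
    for name, sample in MEASUREMENTS.items():
        print(f"{name}: {cpu_seconds(sample):.3f} s CPU")
    cpu_ratio = cpu_seconds(MEASUREMENTS["ms-link"]) / cpu_seconds(MEASUREMENTS["lld"])
    print(f"CPU time ratio (ms-link / lld): {cpu_ratio:.2f}")
    print(f"Peak private bytes ratio: {ratio('peak_private_bytes'):.2f}")
    print(f"Working set ratio: {ratio('working_set_bytes'):.2f}")
```

Note that the working set at the time of measurement is the one figure where ms-link comes out lower, so the comparison is not uniformly in favor of LLD.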
That said, let's go back to writing code. I created a simple NuGet package which exports two msbuild properties: IlcLinker and CppLinker: https://github.com/teobugslayer/corert-lld
Then, I hacked Ilc's Microsoft.NETCore.Native.Windows.props with these modifications:
<CppLinker Condition="'$(CppLinker)' == ''">link</CppLinker>
<!-- later to PropertyGroup Condition="'$(_VCVarsAllFound)' == '0'" -->
<!--<CppLinker>"$(_CppToolsDirectory)link.exe"</CppLinker>-->
This picked up my linker and successfully built my test project with LLD from my NuGet package.
My knowledge about hacking msbuild stops here. I cannot suggest the proper way for Ilc to pick up the externally defined msbuild property. Your suggestions are welcome.
I actually found a bug in Ilc while testing my setup :)
objwriter.dll depends on msvcp140.dll and vcruntime140.dll, which are not included. @MichalStrehovsky could you please fix this?
@teobugslayer Is requiring the Visual C++ redistributable a bug even? A lot of apps require it on Windows anyway. (that said recompiling that with /MT instead of /MD is possible)
@woachk I haven't stated what the fix would be. Changing the system requirements is a valid fix.
However, given that Ilc already brings ucrtbase.dll and its api-ms-* friends, I assume that creating a self-contained distribution was a goal for the project. I think this is better.
I found another problem - Ilc requires libcmt.lib, libcpmt.lib, and oldnames.lib, which come only with the VC compiler toolset. I am not sure if we can work around these.
To summarize my thoughts on the issue: what work is needed to make building projects with CoreRT dependent only on the .NET Core SDK and no other pre-installed packages?
- [ ] dependency on the Windows SDK for import libraries such as kernel32.lib, and the CRT. Can be solved by manually generating lib files from an existing installation of Windows and distributing them as a separate NuGet package.
- [ ] dependency on the Microsoft Visual C++ redist. Can be solved by distributing the required msvcp* files in the Ilc NuGet package.
- [ ] dependency on the static version of the CRT. Cannot be legally solved, as long as we require the MS toolchain.
- [ ] dependency on the Microsoft build toolchain, esp. the linker. Can be solved by distributing the LLVM linker LLD in a separate NuGet package.
Given that currently we cannot avoid using the static CRT, I think the last point is only marginally important - LLD reduces build times, but this is a nice-to-have feature. However, removing the need for the Windows SDK is very beneficial, especially for people with small SSDs, like me.
Thoughts? Should I pursue this further?
However, removing the need for the Windows SDK is very beneficial, especially for people with small SSDs, like me.
It also avoids the problem of forgetting to specify LIB files for APIs that are hard bound. I've been wondering whether it would make sense to add a compiler option to dump all hardbound p/invokes into a file (could be as simple as a flat list of module names and procedure names), and then add an MSBuild task that runs before the linker, goes over this file, and generates DEF files for the individual libraries. Then add some MSBuild to run link /lib /machine:x64 /def:foo.def /out:foo.lib to generate a LIB file out of each.
The format of the DEF files that link.exe accepts for this is:
LIBRARY foo
EXPORTS
Bar
Baz
We already have some MSBuild tasks that support the compiler so the building blocks are there. @jkotas what do you think?
I like this idea.
As for the format of the file listing all the pinvokes, we can just use the same format that IL Linker produces (mono/linker#992). I don't have a specific reason for why, but we need a format and one was already invented. I wouldn't bother with the full name and assembly field for now, so just:
[
{
"entryPoint": "CustomEntryPoint",
"moduleName": "lib_copyassembly"
},
{
"entryPoint": "FooEntryPoint",
"moduleName": "lib_copyassembly"
}
]
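As a rough illustration of the pipeline discussed above (a p/invoke list in this JSON shape, one DEF file per module, then one link /lib invocation per DEF file), here is a minimal sketch. The function names def_files_from_pinvokes and lib_command are hypothetical, the exports are indented purely by convention, and nothing here reflects how an actual MSBuild task in CoreRT would be written.

```python
import json
from collections import defaultdict


def def_files_from_pinvokes(pinvoke_json):
    """Group entry points by module and render one DEF file body per module."""
    exports = defaultdict(set)
    for record in json.loads(pinvoke_json):
        exports[record["moduleName"]].add(record["entryPoint"])
    rendered = {}
    for module, names in exports.items():
        lines = [f"LIBRARY {module}", "EXPORTS"]
        lines += [f"    {name}" for name in sorted(names)]
        rendered[module] = "\n".join(lines) + "\n"
    return rendered


def lib_command(module, machine="x64"):
    """Argument list mirroring: link /lib /machine:x64 /def:foo.def /out:foo.lib"""
    return [
        "link.exe", "/lib", f"/machine:{machine}",
        f"/def:{module}.def", f"/out:{module}.lib",
    ]


if __name__ == "__main__":
    sample = """[
      {"entryPoint": "CustomEntryPoint", "moduleName": "lib_copyassembly"},
      {"entryPoint": "FooEntryPoint", "moduleName": "lib_copyassembly"}
    ]"""
    for module, body in def_files_from_pinvokes(sample).items():
        print(body, end="")
        print(" ".join(lib_command(module)))
```

The command is only built as a list and printed; actually running it would require link.exe (or a compatible linker) on the path.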
@teobugslayer if you're interested in implementing this, I can try to give you some pointers.
Sure. Nothing you said makes sense to me, so let's see how far we will go.
We are also going to need the right symbols in the .lib to satisfy the dependencies of the unmanaged portion of the runtime, and maybe the CRT too. I am wondering whether it would be better to start by having a checked-in file that has all kernel32, etc. exports from Windows 7. We can use such a list to generate the .lib, but also to do more aggressive hard-binding to PInvokes (i.e. use it for PInvoke configuration).
As for the format of the file listing all the pinvokes, we can just use the same format that IL Linker produces
We ultimately need the .def file that Windows linker understands. What's the advantage of generating this intermediate format and have another step that takes the intermediate format to generate what we need vs. just generating what we need directly?
We are also going to need the right symbols in the .lib to satisfy the dependencies of the unmanaged portion of the runtime, and maybe CRT too.
Mmmm, yeah, that's a bit annoying. Maybe having the compiler consume DEF files would be a better approach indeed because as you said - it can also be used to implement #2454 (which is trying to figure out a way to control whether a DllImport should directly expect an external symbol to be provided at link time, or whether we should do LoadLibrary/GetProcAddress (dlopen/dlsym) at runtime to resolve the import). The only drawback of using this to implement #2454 is that we would be using a Windows-style file format outside Windows too, but maybe that's fine.
What's the advantage of generating this intermediate format and have another step that takes the intermediate format to generate what we need vs. just generating what we need directly?
It's annoying to have a tool invocation within MSBuild that produces multiple outputs. And it felt like having a task do that would be more convenient. But it's irrelevant if we go with the "consume DEF file" approach.
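To make the "consume DEF file" idea a little more concrete, the sketch below parses a simple DEF file and uses its exports to decide whether a p/invoke could be bound directly at link time or should be resolved lazily at runtime. It is only an illustration under stated assumptions: the parser handles just LIBRARY, EXPORTS, one export per line, and ';' comments, and the names parse_def_exports and binding_for are hypothetical rather than anything the compiler actually has.

```python
def parse_def_exports(def_text):
    """Return (library name, set of export names) from a simple DEF file body."""
    library = None
    exports = set()
    in_exports = False
    for raw_line in def_text.splitlines():
        line = raw_line.split(";", 1)[0].strip()  # ';' starts a comment
        if not line:
            continue
        tokens = line.split()
        keyword = tokens[0].upper()
        if keyword == "LIBRARY":
            library = tokens[1] if len(tokens) > 1 else None
            in_exports = False
        elif keyword == "EXPORTS":
            in_exports = True
        elif in_exports:
            # Keep only the exported name, dropping "=internal" or "@ordinal".
            exports.add(tokens[0].split("=", 1)[0])
    return library, exports


def binding_for(module, entry_point, known_exports):
    """'direct' if the DEF data lists the symbol, otherwise 'lazy'."""
    if entry_point in known_exports.get(module.lower(), set()):
        return "direct"  # expect the symbol to be provided at link time
    return "lazy"        # resolve via LoadLibrary/GetProcAddress at runtime


if __name__ == "__main__":
    library, exports = parse_def_exports("LIBRARY foo\nEXPORTS\nBar\nBaz\n")
    known = {library.lower(): exports}
    print(binding_for("foo", "Bar", known))
    print(binding_for("foo", "Qux", known))
```

The same export sets could feed both the import library generation and the p/invoke configuration, which is the appeal of having a single checked-in list.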
I continued my tests, and it turns out that we need Windows 10 SDK because it provides the static CRT - libucrt.lib.
I still added a set of manually generated import LIB files into my repo.
I still do not understand what @MichalStrehovsky and @jkotas are talking about, but at least I showed that the initial issue (remove the dependency on the Windows 10 SDK) is infeasible and cannot be (legally) solved.
Of course, IANAL, and the terms on https://docs.microsoft.com/en-us/legal/windows-sdk/redist may actually allow us to redist these files.
Funny side note: As part of my experiments, I tried using the Windows 2003 SDK. It spectacularly misfired and the CoreRT build started using the Unix build scripts. After a few hours of looking, it turned out that old Windows SDKs define the TARGETOS environment variable as "WINNT". I leave as an exercise to the reader what happens with this condition: <TargetOS Condition="'$(TargetOS)' == ''">$(OS)</TargetOS>
Another problem I could not solve by myself: I tried building ObjWriter (in order to try and add the missing msvc*.dll files) but it failed. Following these instructions, the file built, but tests were failing with this assertion:
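For readers who would rather not do the exercise: MSBuild exposes environment variables as properties and treats property names case-insensitively, so a pre-set TARGETOS makes the condition false and the $(OS) fallback never runs. The sketch below only simulates that lookup in Python; the function name is hypothetical, and the assumption that the build scripts then compare the value against "Windows_NT" is inferred from the behavior described above, not taken from the CoreRT sources.

```python
def resolve_target_os(properties):
    """Mimic <TargetOS Condition="'$(TargetOS)' == ''">$(OS)</TargetOS>."""
    # Property names are case-insensitive, and environment variables are
    # visible as properties, so TARGETOS from the environment counts here.
    lowered = {name.lower(): value for name, value in properties.items()}
    if lowered.get("targetos", "") == "":
        return lowered.get("os", "")  # fallback: use $(OS)
    return lowered["targetos"]        # already set, left untouched


if __name__ == "__main__":
    clean_env = {"OS": "Windows_NT"}
    old_sdk_env = {"OS": "Windows_NT", "TARGETOS": "WINNT"}
    for env in (clean_env, old_sdk_env):
        target = resolve_target_os(env)
        # Assumed downstream check: anything but Windows_NT is treated as non-Windows.
        print(target, "->", "Windows scripts" if target == "Windows_NT" else "other scripts")
```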
---------------------------
Microsoft Visual C++ Runtime Library
---------------------------
Assertion failed!
Program: ...orert\bin\Windows_NT.x64.Debug\tools\objwriter.DLL
File: C:\Dev\corert\bin\obj\Native\Window...\Managed...tic.cpp
Line: 67
Expression: DeleterFn && "ManagedStatic not initialized correctly!"
For information on how your program can cause an assertion
failure, see the Visual C++ documentation on asserts
(Press Retry to debug the application - JIT must be enabled)
---------------------------
Abort Retry Ignore
---------------------------
It's not critical for me, because I won't pursue this task more, but wanted to share the information in case you think it's actionable.
@MichalStrehovsky @jkotas Can we not just use these? I mean they're preview, but they are made by Microsoft.
- https://www.nuget.org/packages/Microsoft.Windows.SDK.CPP.x64
- https://www.nuget.org/packages/Microsoft.Windows.SDK.CPP.arm64
- https://www.nuget.org/packages/Microsoft.Windows.SDK.CPP.arm
- https://www.nuget.org/packages/Microsoft.Windows.SDK.CPP.x86
I think SourceLink is missing from lld, but the other checkboxes seem to be checked.
@mjsabby Do you know what those packages are used for? I can't find mentions of them on the internet. I want to make sure that we would not take a dependency on some experimental packaging of a closed source project that can disappear. The program manager on the C++ team whom I would ask seems to have left the company.