Include compiler-rt in the JIT environment (with fix proposition)

Open aguinet opened this issue 5 years ago • 9 comments

Hello everyone!

compiler-rt functions aren't included with llvmlite, so generated code that relies on them does not work.

Reproducer

One example is the LLVM IR here: https://pastebin.com/SNgXV0Ph . It has been generated using clang 8 from the following code under Linux/x86_64: https://pastebin.com/k5tJ5NiT .

With the Python script here (https://pastebin.com/WcAy4BSv) and llvmlite 0.29.0, we get a segfault. The call to __udivti3 generated by clang (cf. https://godbolt.org/z/AbZ_92) targets a function that is part of compiler-rt. Because the LLVM dynamic loader can't find this function among the symbols loaded in the process, it doesn't patch the mov rXX, 0 instruction that is supposed to load the pointer to this function, and we end up with a call to address 0.
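To check whether a given builtin is visible in the current process before JIT-compiling code that calls it, something like the following can help. It is only a rough approximation of what the JIT's symbol lookup does, not a reproduction of it; the helper name `symbol_in_process` is hypothetical, and the fallback of returning False where `ctypes.CDLL(None)` is unsupported (e.g. Windows) is an assumption.

```python
import ctypes


def symbol_in_process(name):
    """Return True if `name` resolves among symbols already loaded in this process."""
    try:
        # On POSIX, CDLL(None) opens a handle on the running process itself.
        process = ctypes.CDLL(None)
    except (OSError, TypeError):
        # Assumption: where this is unsupported, report the symbol as not found.
        return False
    try:
        getattr(process, name)
    except AttributeError:
        return False
    return True


if __name__ == "__main__":
    print("__udivti3 visible:", symbol_in_process("__udivti3"))
```

The result depends on which shared libraries the Python process happens to have loaded, so it can differ from one machine to another.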

This test case has been extracted from a unit test of the Miasm project (https://github.com/cea-sec/miasm/blob/master/test/arch/x86/unit/mn_div.py).

Note that this issue arises under Windows and Linux x86/64.

Fix proposal 1

One way to fix this is to use llvm::ExecutionEngine::addArchive, with a pre-compiled compiler-rt/builtins static library. I made a quick'n'dirty POC here: https://github.com/aguinet/llvmlite/commit/89862c086b3120b30fa67a9aab0c8dc904d02952 . This indeed fixes the issue for Linux. I still have to try this under Windows.

Fix proposal 2

Another fix that could work is to statically compile compiler-rt into the llvmlite .so/.dll binary. We can't use compiler-rt "as such" because we need to modify its CMake system a little bit to generate an archive with exported symbols. I tried a variant of this under Linux by generating a .so file with the builtins symbols exported and loading it with LD_PRELOAD. This also fixes the bug. I don't know if this would work on Windows though.

Conclusion

Which fix would you prefer to see implemented? Do you have other suggestions?

Thanks :)

aguinet avatar Jun 24 '19 18:06 aguinet

It has never been clear to me whether compiler-rt works or is needed on Windows. See http://lists.llvm.org/pipermail/cfe-dev/2018-January/056517.html and https://reviews.llvm.org/D41813.

In numba, we avoided using compiler-rt by implementing some of the builtins. We should re-evaluate the status of compiler-rt.

sklam avatar Jun 24 '19 19:06 sklam

I just installed LLVM8 under Windows, and clang_rt.builtins-x86_64.lib is provided. I will try the addArchive thing on Windows later on, that might work out-of-the-box. The question would be how to ship this with llvmlite...

aguinet avatar Jun 24 '19 19:06 aguinet

Notes from talking with numba core-devs:

Proposal 1 is probably the easiest. We can ship the builtins static library in llvmlite. But we will need to verify if it is compatible with manylinux wheels. Also, compiler-rt has the sanitizers. Those can live in separate packages.

sklam avatar Jul 02 '19 12:07 sklam

Thanks @sklam for the heads up! For information, I tried using addArchive under Windows, and there is an issue.

MCJIT under Windows generates ELF objects, because at first there was only a runtime dynamic loader for ELF. As far as I can see, LLVM8 also provides one for COFF, but I can't manage to make MCJIT use COFF objects. So addArchive doesn't work, because clang_rt.builtins.obj is a COFF file. One solution is to generate an ELF (but that requires hacking into compiler-rt's build system); the other is to try to make MCJIT use COFF files under Windows. I think I'll ask on llvm-dev, and I'll keep you posted!
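To see which container format a given object or library file uses, the leading bytes are usually enough. This is a minimal sketch: the function name `sniff_object_format` is hypothetical, and the COFF check only recognizes x86_64 objects, which start with the little-endian machine type 0x8664.

```python
ELF_MAGIC = b"\x7fELF"            # start of every ELF file
AR_MAGIC = b"!<arch>\n"           # start of a static archive (.a / .lib)
COFF_MACHINE_AMD64 = b"\x64\x86"  # IMAGE_FILE_MACHINE_AMD64 (0x8664), little-endian


def sniff_object_format(header):
    """Classify a file from its first bytes; returns a short format label."""
    if header.startswith(ELF_MAGIC):
        return "elf"
    if header.startswith(AR_MAGIC):
        return "archive"
    if header.startswith(COFF_MACHINE_AMD64):
        return "coff-x86_64"
    return "unknown"


def sniff_file(path):
    """Read the first bytes of `path` and classify them."""
    with open(path, "rb") as handle:
        return sniff_object_format(handle.read(8))
```

An archive is only a container, so for a static library it is the members inside it that would have to be checked for ELF or COFF.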

aguinet avatar Jul 02 '19 13:07 aguinet

I've asked the author of the COFF runtime dynamic loader whether we could use it under Windows for JIT purposes, and here are parts of his answer (quoted with his permission) describing the potential problems:

One is that (the contributions from) object files in a fully-bound image expect to be located within 4GB of each other, and compilers depend on this and will use special image-based relocations to save space. EG using a 32 bit displacement off of some image-relative base. So the dynamic loader has to guarantee this too. Since there is no linking stage, this effectively means all dynamically loaded COFF objects need to be in the same 4GB region.

Another is that there are often implicit DLL imports in object files. Usually the linker will resolve these against special "stub" archive libraries representing DLL exports, and there are often special linker hints embedded in the object to request specific stub archives be present when linking. These imports may not be reachable by 32 bit offsets so the linker will add indirection cells (at least for code). So all this processing also needs to be replicated.

Beyond this there are further challenges, say getting things like C++ static initializers to work as expected...

It looks like we have to stick with ELF under Windows. The easiest thing might well be to compile compiler-rt into the llvmlite FFI library...
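The first constraint in the quoted answer can be illustrated with a small check. This is a sketch under the assumption that an image-relative reference is an unsigned 32-bit offset from a common base address; the function names are hypothetical.

```python
def fits_image_relative(image_base, address):
    """True if `address` can be encoded as an unsigned 32-bit offset from `image_base`."""
    return 0 <= address - image_base < 2**32


def all_in_same_region(image_base, addresses):
    """True if every address is reachable by a 32-bit offset from the same base."""
    return all(fits_image_relative(image_base, addr) for addr in addresses)
```

A loader that places each object wherever memory happens to be available cannot guarantee this property, which is the problem the answer describes.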

aguinet avatar Aug 16 '19 06:08 aguinet

For what it's worth, we tried this hack for Miasm: https://github.com/cea-sec/miasm/pull/1153/files

Maybe the easiest thing is to include these functions directly within llvmlite?

(I can make the PR if that's fine for you)
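For reference, here is a small Python model of what __udivti3 computes: truncating division of two unsigned 128-bit integers. It is not the compiler-rt implementation and not what the Miasm pull request does; the name `udivti3_model` and the choice to pass each operand as two 64-bit halves (low, high) are assumptions made for illustration.

```python
MASK64 = 2**64 - 1


def udivti3_model(a_lo, a_hi, b_lo, b_hi):
    """Model unsigned 128-bit division; returns the quotient as (low, high) 64-bit halves."""
    a = (a_hi << 64) | a_lo
    b = (b_hi << 64) | b_lo
    # Division by zero is undefined for the native builtin; here Python raises
    # ZeroDivisionError instead.
    quotient = a // b
    return quotient & MASK64, (quotient >> 64) & MASK64


if __name__ == "__main__":
    # 2**64 divided by 2, with each operand split into (low, high) halves.
    print(udivti3_model(0, 1, 2, 0))
```

A model like this can serve as an oracle when testing a JIT-compiled replacement for the builtin.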

aguinet avatar May 22 '20 21:05 aguinet

This is basically proposal 2 :)

aguinet avatar May 22 '20 21:05 aguinet

Bookmarking this issue. I am willing to help solve the f16 problem.

dongrixinyu avatar Feb 08 '23 08:02 dongrixinyu

linking to https://numba.discourse.group/t/add-support-for-compiler-rt/1780. @testhound has been working on compiler-rt lately

sklam avatar Feb 13 '23 14:02 sklam