Evaluate using LLVM BOLT to improve package performance
Hello.
According to the Facebook research paper (https://research.facebook.com/publications/bolt-a-practical-binary-optimizer-for-data-centers-and-beyond/), LLVM BOLT (https://github.com/llvm/llvm-project/blob/main/bolt/README.md) improves the performance of various packages, such as compilers and interpreters. Since Clear Linux is all about performance, I think it would be a good idea to enable LLVM BOLT for some packages to deliver faster binaries to users (or at least to make it easier for users to recompile binaries with BOLT).
Here are some examples of projects that have already integrated LLVM BOLT:
- Rustc: https://github.com/rust-lang/rust/pull/116352
- CPython: https://github.com/python/cpython/pull/95908
- Pyston:
  - https://github.com/pyston/pyston#building
  - https://github.com/pyston/pyston/blob/pyston_main/Makefile#L200
- Clang: https://github.com/llvm/llvm-project/blob/main/clang/cmake/caches/BOLT.cmake
So, at least for the projects above, the effects of LLVM BOLT have already been tested and some preparation has been done upstream, which should make it easier to enable BOLT for these packages.
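For readers unfamiliar with it, the generic profile-then-optimize flow behind such integrations can be sketched as follows. This is a minimal sketch, not a Clear Linux build recipe: it follows the sampling-based example in the BOLT README, the helper name `bolt_commands` and the file names are hypothetical, the flags may differ between LLVM versions, and upstream integrations may use BOLT's instrumentation mode instead of `perf` sampling. The commands are only assembled and printed, not executed.

```python
import shlex


def bolt_commands(binary, workload_args, perf_data="perf.data", fdata="perf.fdata"):
    """Assemble (but do not run) the three steps of a sampling-based BOLT flow.

    `binary` and `workload_args` are placeholders for a real executable and a
    representative workload; nothing here is specific to any one package.
    """
    # Step 1: sample the binary under a representative workload, recording
    # branch information (needs hardware and kernel support for branch sampling).
    record = ["perf", "record", "-e", "cycles:u", "-j", "any,u",
              "-o", perf_data, "--", binary, *workload_args]
    # Step 2: convert the perf samples into BOLT's profile format.
    convert = ["perf2bolt", "-p", perf_data, "-o", fdata, binary]
    # Step 3: rewrite the binary using the profile (block and function
    # reordering plus hot/cold splitting, as in the README example).
    optimize = ["llvm-bolt", binary, "-o", binary + ".bolt", f"-data={fdata}",
                "-reorder-blocks=ext-tsp", "-reorder-functions=hfsort",
                "-split-functions", "-split-all-cold", "-split-eh", "-dyno-stats"]
    return [record, convert, optimize]


if __name__ == "__main__":
    # "./myapp" and "--bench" are hypothetical stand-ins.
    for cmd in bolt_commands("./myapp", ["--bench"]):
        print(shlex.join(cmd))
```

The README also recommends linking the input binary with `--emit-relocs` so that BOLT can reorder functions, which is something a distro build would need to arrange per package.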
For some other projects, work on integrating LLVM BOLT into the build scripts is ongoing:
- Chromium: https://bugs.chromium.org/p/chromium/issues/detail?id=1163978
- Firefox: https://bugzilla.mozilla.org/show_bug.cgi?id=1789087
  - The same for Propeller (an LLVM BOLT alternative): https://bugzilla.mozilla.org/show_bug.cgi?id=1509314
- NodeJS: https://github.com/nodejs/node/issues/50379
- LDC: https://github.com/ldc-developers/ldc/issues/4228
- GCC: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112492
More LLVM BOLT performance results for these and other projects can be found here:
- Rustc:
  - https://github.com/rust-lang/rust/pull/116352
  - https://www.reddit.com/r/rust/comments/y4w2kr/llvm_used_by_rustc_is_now_optimized_with_bolt_on/
- CPython: https://github.com/python/cpython/pull/95908
- YDB: https://github.com/ydb-platform/ydb/issues/140
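- Clang:
  - Slides: https://llvm.org/devmtg/2022-11/slides/Lightning15-OptimizingClangWithBOLTUsingCMake.pdf
  - Results on building Clang: https://github.com/ptr1337/llvm-bolt-scripts/blob/master/results.md
  - Linaro results: https://android-review.linaro.org/plugins/gitiles/toolchain/llvm_android/+/f36c64eeddf531b7b1a144c40f61d6c9a78eee7a
  - On AMD 7950X3D: https://github.com/llvm/llvm-project/issues/65010#issuecomment-1701255347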
- LDC: https://github.com/ldc-developers/ldc/issues/4228#issuecomment-1334499428
- NodeJS: https://aaupov.github.io/blog/2020/10/08/bolt-nodejs
- Chromium: https://aaupov.github.io/blog/2022/11/12/bolt-chromium
- MySQL, MongoDB, memcached, Verilator: https://people.ucsc.edu/~hlitz/papers/ocolos.pdf
More information about the topic can be found here: https://github.com/zamazan4ik/awesome-pgo
I have not created an issue per project (like "Enable BOLT for Clang", "Enable BOLT for GCC", etc.) because I think we first need to discuss the approach. If we agree on enabling BOLT, we can create additional per-project issues and use this one as a BOLT meta issue.
We have been looking at BOLT for quite some time, and even built a series of prototypes, both of similar tools and of new optimizations inside BOLT.
BOLT has some logistical issues that make it hard to use well, but what makes it messy for us is that it still creates invalid output at times.
BUT we will be doing something BOLT-like in the very near future (we're finishing up the final pieces right now) that, while not at BOLT's level, should get close, with the logistics solved for a distro and without the risk of invalid output. We want to get this widely deployed in the OS this year :)
Thanks a lot for such valuable insights! I think we could leave the issue open so that, when there is progress on the topic, we can track it here.
Yeah, no problem, and thanks for the suggestion.
We are always open to and looking for more optimizations, at least to evaluate how practical they are.
Hi there,
Please report the cases of invalid output through LLVM issue tracker so we can address them: https://github.com/llvm/llvm-project/issues/new?labels=BOLT&assignees=aaupov&title=[BOLT].
I'm also interested to hear about your alternative solution and look forward to the news coverage!