enable LTO for chromium

Open Hello71 opened this issue 7 years ago • 92 comments

LTO has been enabled for official builds upstream since 2016, but the Gentoo ebuild does not use it, even if -flto is specified.

Hello71 avatar Jun 23 '18 23:06 Hello71

I've been building successfully with USE custom-cflags, which should disable the flag stripping, although I haven't verified that LTO is being enabled

wolfwood avatar Jun 25 '18 19:06 wolfwood

Anyone willing to do a PR? I don't use Chromium myself and I remember it took forever to compile.

InBetweenNames avatar Jun 25 '18 19:06 InBetweenNames

@InBetweenNames is there a way to confirm a binary has been LTO'd or do you look at the build logs?

wolfwood avatar Jun 25 '18 21:06 wolfwood

I just look at the build logs, however if there are any static libraries, you can objdump them to see if there are any GIMPLE symbols in the output. Example from sys-devel/flex:

>qfile /usr/lib64/libfl.a
...
libyywrap.o:     file format elf64-x86-64

SYMBOL TABLE:
0000000000000000 l    d  .text  0000000000000000 .text
0000000000000000 l    d  .data  0000000000000000 .data
0000000000000000 l    d  .bss   0000000000000000 .bss
0000000000000000 l    d  .gnu.lto_.profile.2caf6e470d364625     0000000000000000 .gnu.lto_.profile.2caf6e470d364625
0000000000000000 l    d  .gnu.lto_.icf.2caf6e470d364625 0000000000000000 .gnu.lto_.icf.2caf6e470d364625
0000000000000000 l    d  .gnu.lto_.jmpfuncs.2caf6e470d364625    0000000000000000 .gnu.lto_.jmpfuncs.2caf6e470d364625
...
...

The .gnu.lto_. part is the GIMPLE from GCC. However, shared objects won't have any GIMPLE associated with them, so executables and .so files won't look any different.
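
A minimal sketch of automating that check, assuming the symbol table is produced with objdump -t (the function name has_gcc_lto_sections is made up for this example). It only tells you whether .gnu.lto_ sections show up in the text it is given, so it is useful for static libraries and object files, not for the final executable or .so:

```shell
# Succeeds if the text on stdin mentions GCC LTO sections,
# i.e. names containing ".gnu.lto_".
has_gcc_lto_sections() {
    grep -q '\.gnu\.lto_'
}

# Hypothetical usage against the archive from the example above:
#   objdump -t /usr/lib64/libfl.a | has_gcc_lto_sections && echo "LTO sections present"
```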

InBetweenNames avatar Jun 27 '18 18:06 InBetweenNames

but if they're unstripped and you're lucky you can find LTO function fragments.

Hello71 avatar Jun 27 '18 18:06 Hello71

  1. install sys-devel/lld
  2. set EXTRA_GN="thin_lto_enable_optimizations=true use_lld=true use_thin_lto=true"

alternatively, simply:

  1. set www-client/chromium USE=custom-cflags
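
For the EXTRA_GN route, one place the variable can live is a per-package file under /etc/portage/env, which is what a later comment in this thread uses. The sketch below writes that file into a scratch root instead of the live system so it can be tried safely; the function name and the scratch-root argument are assumptions for this example, not part of Portage:

```shell
# Write the per-package environment file below the directory given as $1.
# Passing "" would target the real /etc/portage; a temporary directory is safer.
write_chromium_lto_env() {
    root="$1"
    dir="$root/etc/portage/env/www-client"
    mkdir -p "$dir" || return 1
    cat > "$dir/chromium" <<'EOF'
EXTRA_GN="thin_lto_enable_optimizations=true use_lld=true use_thin_lto=true"
EOF
}

# Hypothetical usage:
#   scratch=$(mktemp -d) && write_chromium_lto_env "$scratch"
#   cat "$scratch/etc/portage/env/www-client/chromium"
```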

Hello71 avatar Sep 23 '18 02:09 Hello71

OK, I finally built it successfully. This does work. Note that it significantly increases CPU, RAM, and disk usage during the build: very, very approximately, it doubles each of these.

Hello71 avatar Oct 12 '18 20:10 Hello71

@Hello71, were you able to just use custom-cflags or did it require more in-depth workarounds?

InBetweenNames avatar Oct 13 '18 14:10 InBetweenNames

I didn't use custom-cflags, I used the gn options. I assume this is better supported upstream.

Hello71 avatar Oct 14 '18 13:10 Hello71

Yeah, I opened a bug report about that several months ago because I noticed that in benchmarks Chromium compiled on Gentoo was slower than binary Chrome or Arch Linux's Chromium. The problem is that you should use the latest clang and LLD, and either do an official_build or enable the LTO flags. Also, with LTO the build takes much longer and you'll probably need a very strong/server machine with at least 16GB of RAM.

I also tried to compile Chromium with GCC using fedora's patches but it was even slower in benchmarks.

funghetto avatar Nov 01 '18 14:11 funghetto

@funghetto How much slower, and what was slower, JS? I only compile Chromium with GCC and have had no issue with performance with custom cflags.

sjnewbury avatar Nov 07 '18 07:11 sjnewbury

I've used mainly Speedometer (https://browserbench.org/Speedometer2.0/) and there was like a 15%-20% performance drop there.

funghetto avatar Nov 07 '18 16:11 funghetto

USE=custom-cflags compilation works for me by changing the compiler to clang, linker to lld and changing -flto=n (which isn't a valid option in clang) to -flto. Is that something to make a PR for, or do we avoid replacing gcc with clang (since I haven't seen this as something suggested anywhere nor noticed such workarounds)? @InBetweenNames
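
A rough sketch of the flag rewrite described here, assuming the flags are available as a plain string. The function name normalize_lto_for_clang is invented for this example; it only touches the numeric form of -flto=n and leaves -flto=thin alone. Other GCC-only flags are not handled:

```shell
# Replace GCC-style "-flto=<number>" with plain "-flto" in the flag string $1.
# Non-numeric values such as -flto=thin are left untouched.
normalize_lto_for_clang() {
    printf '%s\n' "$1" | sed 's/-flto=[0-9][0-9]*/-flto/g'
}

# Hypothetical usage:
#   CFLAGS=$(normalize_lto_for_clang "$CFLAGS")
```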

mateuszmandera avatar Nov 28 '18 18:11 mateuszmandera

While it is not about Chromium but Firefox, there is a great blog post of a GCC developer with some interesting data points on the Clang vs. GCC debate: http://hubicka.blogspot.com/2018/12/firefox-64-built-with-gcc-and-clang.html

ms178 avatar Dec 23 '18 23:12 ms178

For Firefox, at this point I've given up and switched to clang+lto+pgo, mostly because it's more tested (being the default). The Firefox source code has a decent amount of ambiguous code that easily breaks with many optimizations from one version to the next, and the result is often a subtle issue rather than a segfault. Most are long-standing known issues that aren't getting fixed anytime soon, and only gcc 6 is officially supported too.

Shame the ebuild doesn't have a flag for pgo right now though, but it's just about setting MOZ_PGO=1.

Even if gcc+lto+pgo performs better, I'm not sure I want to deal with this anymore unless it gets some official support effort.

ionenwks avatar Dec 24 '18 02:12 ionenwks

In the short term, you can use his binary instead. :)

I have uploaded a binary build with GCC 8, with link-time optimization and profile feedback. If your curiosity exceeds the fear of running random binaries from the net, you are welcome to try it out. It is built from Firefox 64 release. You can compare it to the official build and build provided by your favourite distro.

ms178 avatar Dec 24 '18 10:12 ms178

I've managed to build Chromium with LTO.

www-client/chromium-76.0.3809.100::gentoo was built with the following:
USE="cups custom-cflags jumbo-build (pic) proprietary-codecs pulseaudio suid system-ffmpeg system-icu system-libvpx tcmalloc -closure-compile -component-build -gnome-keyring -hangouts -kerberos (-selinux) -widevine" L10N="ru -am -ar -bg -bn -ca -cs -da -de -el -en-GB -es -es-419 -et -fa -fi -fil -fr -gu -he -hi -hr -hu -id -it -ja -kn -ko -lt -lv -ml -mr -ms -nb -nl -pl -pt-BR -pt-PT -ro -sk -sl -sr -sv -sw -ta -te -th -tr -uk -vi -zh-CN -zh-TW"
CFLAGS="-march=native -mtune=native -O2 -pipe -fomit-frame-pointer -fno-plt -fno-stack-protector -ftree-vectorize -s"
CXXFLAGS="-march=native -mtune=native -O2 -pipe -fomit-frame-pointer -fno-plt -fno-stack-protector -ftree-vectorize -s -Wno-pedantic -Wno-unused-result -Wno-unused-function -Wno-unused-variable -Wno-unused-but-set-variable -Wno-deprecated-declarations -Wno-return-type -Wno-parentheses -Wno-misleading-indentation -Wno-attributes -Wno-subobject-linkage -Wno-ignored-attributes -Wno-ignored-attributes -Wno-address -Wno-dangling-else -Wno-class-memaccess -Wno-invalid-offsetof -Wno-packed-not-aligned"
FEATURES="assume-digests binpkg-docompress binpkg-dostrip binpkg-logs ccache config-protect-if-modified distlocks ebuild-locks fixlafiles ipc-sandbox merge-sync multilib-strict network-sandbox news parallel-fetch parallel-install pid-sandbox preserve-libs protect-owned sandbox sfperms strict unknown-features-warn unmerge-logs unmerge-orphans userfetch userpriv usersandbox usersync xattr"
LDFLAGS="-Wl,-O2 -Wl,--as-needed -Wl,--sort-common -Wl,--strip-debug -flto"

perfect7gentleman avatar Aug 19 '19 11:08 perfect7gentleman

with clang? or gcc ? @perfect7gentleman

javashin avatar Aug 19 '19 18:08 javashin

GCC. I can't build properly working Chromium with Clang.

perfect7gentleman avatar Aug 20 '19 00:08 perfect7gentleman

good thanks @perfect7gentleman for the info

javashin avatar Aug 20 '19 06:08 javashin

I've used some Debian and openSUSE patches

perfect7gentleman avatar Aug 20 '19 10:08 perfect7gentleman

EXTRA_GN="\
is_debug=false \
use_goma=false \
use_ozone=false \
use_sysroot=false \
use_openh264=false \
use_libjpeg_turbo=true \
use_custom_libcxx=false \
use_gnome_keyring=false \
use_unofficial_version_number=false \
enable_vr=false \
enable_nacl=false \
enable_nacl_nonsfi=false \
enable_swiftshader=false \
enable_reading_list=false \
enable_one_click_signin=false \
enable_iterator_debugging=false \
enable_hangout_services_extension=false \
optimize_webui=true \
treat_warnings_as_errors=false \
linux_use_bundled_binutils=false \
\
use_gio=false \
link_pulseaudio=true \
enable_widevine=false \
v8_enable_backtrace=true \
use_system_zlib=true \
use_system_lcms2=true \
use_system_libjpeg=true \
use_system_freetype=true \
use_system_harfbuzz=true \
use_system_libopenjpeg2=true \
use_jumbo_build=true \
proprietary_codecs=true \
ffmpeg_branding=\"Chrome\" \
fieldtrial_testing_like_official_build=true \
\
\
use_aura=true \
symbol_level=0 \
use_kerberos=false \
fatal_linker_warnings=false \
use_gnome_keyring=false \
use_vaapi=true \
use_dbus=true \
enable_hevc_demuxing=true \
enable_mus=true \
gcc_lto=true"
CCACHE_SLOPPINESS="time_macros"
~ $ ls /etc/portage/patches/www-client/chromium
001-libcxx.patch            013-inspector._patch_      021-widevine-locations.patch  029-openh264.patch      036-google-api-warning.patch    044-deprecated.patch        051-null-destination.patch     058-nspr.patch        102-chromium-dma-buf.patch        125-chromium-vaapi.patch
002-parallel.patch          014-gpu-timeout.patch      023-connection-message.patch  030-chromeos.patch      037-third-party-cookies.patch   045-bool-compare.patch      052-int-in-bool-context.patch  059-zlib.patch        103-chromium-buildname.patch
003-gcc_skcms_ice.patch     015-empty-array.patch      024-unrar.patch               031-perfetto.patch      038-device-notifications.patch  046-enum-compare.patch      053-vpx.patch                  060-event._patch_     104-chromium-drm.patch
004-pffffft-buildfix.patch  016-safebrowsing.patch     025-signin.patch              032-installer.patch     040-friend.patch                047-sign-compare.patch      054-icu.patch                  061-ffmpeg.patch      105-chromium-sandbox-pie.patch
009-mojo.patch              017-sequence-point.patch   026-android.patch             033-font-tests.patch    041-printf.patch                048-initialization.patch    055-gtk2.patch                 062-jsoncpp._patch_   107-chromium-system-libusb.patch
011-ps-print.patch          018-jumbo-namespace.patch  027-fuzzers.patch             034-swiftshader.patch   042-attribute.patch             049-unused-typedefs.patch   056-jpeg.patch                 063-openjpeg.patch    109-gcc-lto-rsp-clobber.patch
012-as-needed.patch         019-template-export.patch  028-tracing.patch             035-welcome-page.patch  043-multichar.patch             050-unused-functions.patch  057-lcms.patch                 064-convertutf.patch  110-gcc-enable-lto.patch

perfect7gentleman avatar Aug 20 '19 10:08 perfect7gentleman

On the other hand, I can't compile chromium with gcc 9 and lto:

[4259/19949] rm -f obj/third_party/blink/public/mojom/libweb_feature_mojo_bindings_mojom.a && "x86_64-pc-linux-gnu-ar" -T -r -c -s -D obj/third_party/blink/public/mojom/libweb_feature_mojo_bindings_mojom.a @"obj/third_party/blink/public/mojom/libweb_feature_mojo_bindings_mojom.a.rsp"
[4260/19949] x86_64-pc-linux-gnu-g++ -fPIC -Wl,-z,noexecstack -Wl,-z,relro -Wl,-z,now -Wl,-z,defs -Wl,--as-needed -rdynamic -pie -Wl,--disable-new-dtags -Wl,-O1 -Wl,--as-needed -O2 -march=native -falign-functions=32 -O3 -fgraphite-identity -floop-nest-optimize -fdevirtualize-at-ltrans -fipa-pta -fno-semantic-interposition -flto=1 -fuse-linker-plugin -o "./character_data_generator" -Wl,--start-group @"./character_data_generator.rsp"  -Wl,--end-group  -latomic -ldl -lpthread -lrt -lgmodule-2.0 -lgobject-2.0 -lgthread-2.0 -lglib-2.0 -licui18n -licuuc -licudata
FAILED: character_data_generator 
x86_64-pc-linux-gnu-g++ -fPIC -Wl,-z,noexecstack -Wl,-z,relro -Wl,-z,now -Wl,-z,defs -Wl,--as-needed -rdynamic -pie -Wl,--disable-new-dtags -Wl,-O1 -Wl,--as-needed -O2 -march=native -falign-functions=32 -O3 -fgraphite-identity -floop-nest-optimize -fdevirtualize-at-ltrans -fipa-pta -fno-semantic-interposition -flto=1 -fuse-linker-plugin -o "./character_data_generator" -Wl,--start-group @"./character_data_generator.rsp"  -Wl,--end-group  -latomic -ldl -lpthread -lrt -lgmodule-2.0 -lgobject-2.0 -lgthread-2.0 -lglib-2.0 -licui18n -licuuc -licudata
during IPA pass: pta
lto1: internal compiler error: Segmentation fault
Please submit a full bug report,
with preprocessed source if appropriate.
See <https://bugs.gentoo.org/> for instructions.
lto-wrapper: fatal error: x86_64-pc-linux-gnu-g++ returned 1 exit status
compilation terminated.
/usr/lib/gcc/x86_64-pc-linux-gnu/9.2.0/../../../../x86_64-pc-linux-gnu/bin/ld: error: lto-wrapper failed
collect2: error: ld returned 1 exit status
ninja: build stopped: subcommand failed.
 * ERROR: www-client/chromium-79.0.3945.29::gentoo failed (compile phase):
 *   ninja -v -j1 -l1 -C out/Release v8_context_snapshot_generator failed

ElDavoo avatar Nov 15 '19 07:11 ElDavoo

the problem was actually -fipa-pta so I disabled and tried to compile without it. ICE (maybe on the same file too) in the Graphite stage. I'm now trying to build it without Graphite...
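
For anyone repeating this kind of bisection, here is a small sketch for dropping suspect flags from a flag string between attempts. The function name strip_flags is made up for this example and it does exact word matching only:

```shell
# Print the flags in $1 with every flag listed in $2 removed.
# Both arguments are space-separated strings.
strip_flags() {
    out=""
    for flag in $1; do
        keep=1
        for bad in $2; do
            if [ "$flag" = "$bad" ]; then
                keep=0
            fi
        done
        if [ "$keep" -eq 1 ]; then
            out="${out:+$out }$flag"
        fi
    done
    printf '%s\n' "$out"
}

# Hypothetical usage: retry without -fipa-pta and the Graphite flags.
#   CXXFLAGS=$(strip_flags "$CXXFLAGS" "-fipa-pta -fgraphite-identity -floop-nest-optimize")
```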

ElDavoo avatar Nov 16 '19 00:11 ElDavoo

Hmm... That's interesting. I have never built Chrome(ium) though, since LibreOffice and QtWebengine take long enough already and my primary browser is Firefox Nightly.

elsandosgrande avatar Nov 17 '19 07:11 elsandosgrande

The thing is, testing is more difficult because this package is so big. A single test can take 1-2 days if you compile on a low-powered machine. Anyway, dropping Graphite allowed me to get further, but then I got this error.

[7522/19949] python ../../tools/protoc_wrapper/protoc_wrapper.py crx3.proto --protoc ./protoc --proto-in-dir ../../components/crx_file --cc-out-dir gen/components/crx_file --py-out-dir pyproto/components/crx_file
FAILED: pyproto/components/crx_file/crx3_pb2.py gen/components/crx_file/crx3.pb.h gen/components/crx_file/crx3.pb.cc 
python ../../tools/protoc_wrapper/protoc_wrapper.py crx3.proto --protoc ./protoc --proto-in-dir ../../components/crx_file --cc-out-dir gen/components/crx_file --py-out-dir pyproto/components/crx_file
terminate called after throwing an instance of 'std::system_error'
  what():  Unknown error -1
Protoc has returned non-zero status: -6

Since this error does not give any clue on what is going wrong, I'm now trying to compile with USE="-custom-cflags".

ElDavoo avatar Nov 17 '19 11:11 ElDavoo

I could test it for you; it takes an hour or so to build here, but I gave up on compiling with gcc. Chromium uses clang-specific optimizations/patches, which result in significant performance differences [ ~20% ]

zkvsky avatar Nov 17 '19 15:11 zkvsky

Chromium uses clang specific optimizations/patches which result in significant performance differences [ 20%~ ]

Can you demonstrate that? If that's true, why don't we force chromium to use clang as the compiler?

ElDavoo avatar Nov 17 '19 16:11 ElDavoo

Chrome/ium uses clang's -fwhole-program-vtables for a performance boost, amongst other things. "Skia contains SSE and AVX optimized rendering routines which are written using Clang only vector extensions." That is from Honza Hubicka's blog [ GCC dev ]. If you need more stuff I'll link it later.

As to why it isn't used? Nobody cares about compiling chromium from source, with the exception of the niche Gentoo user. Gentoo isn't particularly bleeding edge at its core, nor does it care that much about chromium or clang. There are talks, but it moves at a snail's pace.

I've seen some ebuilds in overlays with clang as an option but most of them disappeared recently, chromium is a fast moving target and stuff around it adapts slowly

Firefox upstream is also compiled with clang, just FYI, and the clang-built version had double the rendering performance for some time [ which was fixed/patched ]

zkvsky avatar Nov 17 '19 16:11 zkvsky

Chrome/ium uses clang's -fwhole-program-vtables for a performance boost, amongst other things. "Skia contains SSE and AVX optimized rendering routines which are written using Clang only vector extensions." That is from Honza Hubicka's blog [ GCC dev ].

After reading about this I decided to give clang another try. I am able to build chromium fine with GCC 9.2 using LTO, but when I attempt to use clang, it fails very early in the build process, and I haven't been able to figure out why. Here is the error I get:

[80/808] clang++ -pie -fPIC -Wl,-z,noexecstack -Wl,-z,relro -Wl,-z,now -Wl,-z,defs -Wl,--as-needed -fuse-ld=lld -Wl,--icf=all -Wl,--color-diagnostics -flto=thin -Wl,--thinlto-jobs=8 -Wl,--thinlto-cache-dir=thinlto-cache -Wl,--thinlto-cache-policy,cache_size=10\%:cache_size_bytes=10g:cache_size_files=100000 -Wl,--lto-O2 -fwhole-program-vtables -rdynamic -pie -Wl,--disable-new-dtags -Wl,-O1 -Wl,--as-needed -Wl,--hash-style=gnu -Wl,-z,relro -Wl,-z,now -flto=thin -O3 -pipe -march=bdver4 -fstack-check -o "./bytecode_builtins_list_generator" -Wl,--start-group @"./bytecode_builtins_list_generator.rsp"  -Wl,--end-group  -latomic -ldl -lpthread -lrt
[81/808] clang++ -pie -fPIC -Wl,-z,noexecstack -Wl,-z,relro -Wl,-z,now -Wl,-z,defs -Wl,--as-needed -fuse-ld=lld -Wl,--icf=all -Wl,--color-diagnostics -flto=thin -Wl,--thinlto-jobs=8 -Wl,--thinlto-cache-dir=thinlto-cache -Wl,--thinlto-cache-policy,cache_size=10\%:cache_size_bytes=10g:cache_size_files=100000 -Wl,--lto-O2 -fwhole-program-vtables -rdynamic -pie -Wl,--disable-new-dtags -Wl,-O1 -Wl,--as-needed -Wl,--hash-style=gnu -Wl,-z,relro -Wl,-z,now -flto=thin -O3 -pipe -march=bdver4 -fstack-check -o "./gen-regexp-special-case" -Wl,--start-group @"./gen-regexp-special-case.rsp"  -Wl,--end-group  -latomic -ldl -lpthread -lrt -licui18n -licuuc -licudata
[82/808] python ../../v8/tools/run.py ./gen-regexp-special-case gen/v8/src/regexp/special-case.cc
[83/808] touch obj/v8/run_gen-regexp-special-case.stamp
[84/808] python ../../v8/tools/run.py ./bytecode_builtins_list_generator gen/v8/builtins-generated/bytecodes-builtins-list.h
FAILED: gen/v8/builtins-generated/bytecodes-builtins-list.h 
python ../../v8/tools/run.py ./bytecode_builtins_list_generator gen/v8/builtins-generated/bytecodes-builtins-list.h
ninja: build stopped: subcommand failed.

I tried enabling FEATURES=-fail-clean and then going into the build directory and running that command manually (python ../../v8/tools/run.py ./bytecode_builtins_list_generator gen/v8/builtins-generated/bytecodes-builtins-list.h). It worked fine, or at least returned exit status 0 and produced a file with the expected name in the expected location. So something in the portage environment is making it fail, and the error only affects running bytecode_builtins_list_generator, not building it. This seems strange: the step that fails doesn't invoke either gcc or clang, and the binary that fails to run, although compiled with clang rather than gcc, runs fine outside of the portage environment. The only changes I made to the portage environment in between were related to the switch from gcc to clang.
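
One way to narrow down what differs is to dump the environment in both situations and compare the dumps. This is only a sketch, assuming each dump is a file with one VAR=value per line (for example the output of env). The function name env_diff and the file names are hypothetical, and multi-line values would confuse it:

```shell
# Print lines that appear in only one of the two environment dumps.
# Lines unique to $1 are unindented; lines unique to $2 are tab-indented.
env_diff() {
    a=$(mktemp) && b=$(mktemp) || return 1
    sort "$1" > "$a"
    sort "$2" > "$b"
    comm -3 "$a" "$b"
    rm -f "$a" "$b"
}

# Hypothetical usage:
#   env > /tmp/shell.env        # in the interactive shell
#   env_diff /tmp/shell.env /tmp/portage.env
```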

The changes I made to the portage environment since successfully building with gcc are as follows:

  1. Enable clang, using a file in /etc/portage/env that looks like this:
CC="clang"
CXX="clang++"
LD="ld.lld"
AR="llvm-ar"
NM="llvm-nm"
RANLIB="llvm-ranlib"
CFLAGS="${CFLAGS} -flto=thin"
CXXFLAGS="${CXXFLAGS} -flto=thin"
LDFLAGS="${LDFLAGS} -flto=thin"
  2. Created /etc/portage/env/www-client/chromium as follows:
EXTRA_GN="thin_lto_enable_optimizations=true use_lld=true use_thin_lto=true"
  3. Enabled the custom-cflags USE flag, although the only difference it seems to make is that it prevents -O3 from being replaced by -O2. Without this USE flag, the build fails in the exact same way.
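
Before starting a long build with a setup like the one listed above, it may be worth confirming that every tool named in the env file is actually on PATH. A small sketch; the function name missing_tools is invented for this example:

```shell
# Print each command name from the arguments that cannot be found on PATH.
missing_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
        fi
    done
}

# Hypothetical usage with the tools from the env file above:
#   missing_tools clang clang++ ld.lld llvm-ar llvm-nm llvm-ranlib
```

If it prints nothing, all of the named tools were found.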

automorphism88 avatar Nov 20 '19 20:11 automorphism88