gentooLTO
Building the Linux kernel using LTO
I find it interesting that there hasn't been more push to build the kernel using LTO. I've found a couple of mailing list threads about it, including a patchset to let it happen, but there wasn't a lot of interest upstream. I've created this issue as a way to track what the current LTO progress in the kernel is, and possibly even add some patchsets to let it happen. I know I'd for sure use it on my router if I could with OpenWRT.
@InBetweenNames I would use it on my router too. I think at the time the GCC LTO toolchain wasn't very mature and few were able to make much use of it, particularly on embedded* Linux, where there would be the most interest. Without that buy-in the kernel devs weren't going to let the patches in.
Perhaps resurrecting the patch set and getting it working again could be successful now that LTO support is pretty ubiquitous in distros and most embedded devs must be using it by now for their user space.
- embedded toolchains tend to be quite conservative and stick around for a while
Seems some remnants of those patches are still in the kernel (notably DISABLE_LTO, so it doesn't use LTO for the vdso), so I tried with 4.19.1. The patchset formerly used scripts/gcc-ld, but that didn't work for me so I used gold. I doubt it's accomplishing anything built this way (size barely changed with other defaults). Despite using gcc-ar, it was also complaining about the LTO plugin unless I passed -ffat-lto-objects. The patchset used to use -fwhole-program too, but that didn't work. Nonetheless, thought I'd do the crazy thing and build the kernel with:
make -j8 AR=gcc-ar NM=gcc-nm LD=ld.gold KCFLAGS="-march=native -O3 -falign-functions=32 -fipa-pta -fno-semantic-interposition -fgraphite-identity -floop-nest-optimize -flto=8 -ffat-lto-objects" DISABLE_LTO=-fno-lto
Which.. worked.. and booted fine. I am now the proud owner of a kernel that is 30% bigger than before, probably not faster, and set out to kill my dog, but thankfully running in QEMU away from my dog. Edit: well, removing LTO with the same options does make it even bigger, by like 10%.
It might be interesting to compare the speed of some syscall- / kernel-bound workloads when successfully built with LTO. Anyone with an idea on how to start benchmarking our gains or losses?
Not sure, but if you check the kernel mailing list plenty of those benchmarks have been done in the past. I remember seeing pretty big gains with LTO, but not sure if those reflected into any gain for daily usage. Some more info about how to benchmark the kernel: https://github.com/graysky2/kernel_gcc_patch
One thing about LTO is you have to build as many of your modules into the kernel as possible, so it knows what it can eliminate when linking. That means you get the biggest gains on a completely static kernel (this of course breaks some things that load firmware etc., though some of that you can work around by building in the blobs).
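To get a feel for how "static" a given configuration already is, a rough sketch like the following could count built-in versus modular options. The function name `count_config_state` is made up for illustration, and it assumes a standard kernel `.config` where options appear as `CONFIG_FOO=y` or `CONFIG_FOO=m`.

```shell
#!/bin/sh
# Hypothetical helper (not part of the kernel tree): count how many options
# in a kernel .config are built in (=y) versus built as modules (=m).
# Usage: count_config_state /usr/src/linux/.config
count_config_state() {
    config_file=$1
    # grep -c exits non-zero when the count is 0, so guard with "|| true".
    n_builtin=$(grep -c '^CONFIG_[A-Za-z0-9_]*=y$' "$config_file" || true)
    n_modular=$(grep -c '^CONFIG_[A-Za-z0-9_]*=m$' "$config_file" || true)
    printf 'builtin=%s modular=%s\n' "$n_builtin" "$n_modular"
}
```

A high `modular` count suggests there is still a lot that LTO cannot see across at link time.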
Andi Kleen rebased his LTO patches for the Kernel on 4.20 recently. I've tried it out but had no luck and several module errors along the way. Nevertheless, you can find these patches here: https://github.com/andikleen/linux-misc/tree/lto-420-1
^ Didn't experiment much but gave it a quick try and it built fine for me with my configuration and CONFIG_LTO=y (which auto-adds -flto -fno-fat-lto-objects). Didn't try a generic one and I use almost no modules which, as stated in the post above, is better suited for an LTO kernel anyway.
Looks like it's using the gcc-ld script and working properly. I do have gold as my default linker (been using it even for kernel).
I imagine it may make more of a difference on a less-lean kernel, but my resulting 4.20 kernel is about 1% smaller than my old, didn't try to boot and also no idea for any performance gains.
@ionenwks I'm trying to replicate the steps on a gentoo system to build an LTO'd kernel. However, I always error out on the linking portion: /usr/lib/gcc/x86_64-pc-linux-gnu/9.1.0/../../../../x86_64-pc-linux-gnu/bin/ld: error: arch/x86/kernel/head_64.o: requires unsupported dynamic reloc 11; recompile with -fPIC
I have added this flag to the base KBUILD_CFLAGS but to no effect. I also have ld.gold enabled by default.
What version of GCC and binutils are you using? Did you make any configurations to the Makefile from Andi Kleen's repo?
Hmm... I tried again both with the lto-420-1 branch from back then along with same configuration and the newer lto-5.1-3, and I'm getting the same errors as you now (using gcc 8.3.0 and ld.bfd 2.32).
Not sure what I was using back then but looking at the date I assume I was on gcc 8.2 and binutils 2.30 I think? It's only something I tried real quick, I had no intention to stick with that for now (or boot it).
Edit: Retried with gold as default (switched back to bfd a while ago), doesn't work either, not with current toolchain anyway. Edit2: And no, I hadn't made any changes, used as-is.
@ionenwks thank you for taking the time to check through the issue! I was afraid it was a toolchain version issue, so I wonder if this is a reportable bug? I'm going to take some time today and check if it's a gcc or binutils issue. Edit: I'm throwing some more configuration testing into this mess. Found this article over on the patch list: https://patchwork.kernel.org/patch/10000627/
I was able to build 5.0-1 successfully, however I did not test it and the system it was on is now gone.
-fPIC would cause reloc .text errors if it was built with visibility=hidden or ssp (but the Makefile already filters that). Maybe -flinker-output=rel would make sense here, but I couldn't get the syntax correct. ~because parts of the kernel build are still static, and static objects aren't able to find PIC references~. If anyone knows where to find his full patchset without a kernel tree that'd be really helpful.
@jiblime You can find his patchset on the kernel mailing list but it won't really help: https://lkml.org/lkml/2017/11/27/1052
THIN_ARCHIVES was a config option that was removed in 4.19+. It worked around the supposed issue of ld -r. But I've narrowed it down to an ld issue of some sort. There are kernel patches that let you -fPIC the code but they aren't working for me yet.
@Promaethius Thanks for the link. I'm currently trying to edit arch/x86/entry/vdso/Makefile to work. At the very bottom you can try appending flags after ${LD} but nothing has worked for me, even the options to specifically suppress the error.
I went and checked a regular kernel and I noticed that it's normal(?) for a hidden symbol to be there.
The command run in both cases was readelf vclock_gettime.o -s
5.1-3 LTO:
Symbol table '.symtab' contains 25 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
1: 0000000000000000 0 FILE LOCAL DEFAULT ABS vclock_gettime.c
2: 0000000000000000 0 SECTION LOCAL DEFAULT 1
3: 0000000000000000 0 SECTION LOCAL DEFAULT 3
4: 0000000000000000 0 SECTION LOCAL DEFAULT 4
5: 0000000000000000 174 FUNC LOCAL DEFAULT 1 do_hres
6: 0000000000000000 0 SECTION LOCAL DEFAULT 5
7: 0000000000000000 0 SECTION LOCAL DEFAULT 7
8: 0000000000000000 0 SECTION LOCAL DEFAULT 8
9: 0000000000000000 0 SECTION LOCAL DEFAULT 10
10: 0000000000000000 0 SECTION LOCAL DEFAULT 11
11: 0000000000000000 0 SECTION LOCAL DEFAULT 12
12: 0000000000000000 0 SECTION LOCAL DEFAULT 14
13: 0000000000000000 0 SECTION LOCAL DEFAULT 15
14: 0000000000000000 0 SECTION LOCAL DEFAULT 17
15: 0000000000000000 0 SECTION LOCAL DEFAULT 19
16: 0000000000000000 0 SECTION LOCAL DEFAULT 20
17: 0000000000000000 0 SECTION LOCAL DEFAULT 18
18: 0000000000000000 0 NOTYPE GLOBAL HIDDEN UND vvar_vsyscall_gtod_data
19: 00000000000000b0 111 FUNC GLOBAL DEFAULT 1 __vdso_clock_gettime
20: 00000000000000b0 111 FUNC WEAK DEFAULT 1 clock_gettime
21: 0000000000000120 98 FUNC GLOBAL DEFAULT 1 __vdso_gettimeofday
22: 0000000000000120 98 FUNC WEAK DEFAULT 1 gettimeofday
23: 0000000000000190 16 FUNC GLOBAL DEFAULT 1 __vdso_time
24: 0000000000000190 16 FUNC WEAK DEFAULT 1 time
readelf: Warning: compressed section '.debug_str' is corrupted
5.2.8 kernel:
Symbol table '.symtab' contains 27 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
1: 0000000000000000 0 FILE LOCAL DEFAULT ABS vclock_gettime.c
2: 0000000000000000 0 SECTION LOCAL DEFAULT 1
3: 0000000000000000 0 SECTION LOCAL DEFAULT 3
4: 0000000000000000 0 SECTION LOCAL DEFAULT 4
5: 0000000000000000 392 FUNC LOCAL DEFAULT 1 do_hres
6: 0000000000000000 0 SECTION LOCAL DEFAULT 5
7: 0000000000000000 0 SECTION LOCAL DEFAULT 7
8: 0000000000000000 0 SECTION LOCAL DEFAULT 8
9: 0000000000000000 0 SECTION LOCAL DEFAULT 10
10: 0000000000000000 0 SECTION LOCAL DEFAULT 11
11: 0000000000000000 0 SECTION LOCAL DEFAULT 12
12: 0000000000000000 0 SECTION LOCAL DEFAULT 14
13: 0000000000000000 0 SECTION LOCAL DEFAULT 15
14: 0000000000000000 0 SECTION LOCAL DEFAULT 17
15: 0000000000000000 0 SECTION LOCAL DEFAULT 19
16: 0000000000000000 0 SECTION LOCAL DEFAULT 20
17: 0000000000000000 0 SECTION LOCAL DEFAULT 18
18: 0000000000000000 0 NOTYPE GLOBAL HIDDEN UND vvar_vsyscall_gtod_data
19: 0000000000000000 0 NOTYPE GLOBAL HIDDEN UND hvclock_page
20: 0000000000000000 0 NOTYPE GLOBAL HIDDEN UND pvclock_page
21: 0000000000000190 102 FUNC GLOBAL DEFAULT 1 __vdso_clock_gettime
22: 0000000000000190 102 FUNC WEAK DEFAULT 1 clock_gettime
23: 0000000000000200 98 FUNC GLOBAL DEFAULT 1 __vdso_gettimeofday
24: 0000000000000200 98 FUNC WEAK DEFAULT 1 gettimeofday
25: 0000000000000270 16 FUNC GLOBAL DEFAULT 1 __vdso_time
26: 0000000000000270 16 FUNC WEAK DEFAULT 1 time
readelf: Warning: compressed section '.debug_str' is corrupted
The readelf warning about the corrupted compressed '.debug_str' section looks to be of interest. Does this mean there needs to be more debug information built in?
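For comparing the two symbol tables above without eyeballing them, a small filter along these lines could pull out just the hidden, undefined symbols. The function name is made up for illustration, and it assumes the usual `readelf -s` column layout (Num, Value, Size, Type, Bind, Vis, Ndx, Name) shown in the dumps.

```shell
#!/bin/sh
# Hypothetical helper: read `readelf -s` output on stdin and print the names
# of symbols whose visibility is HIDDEN and whose section index is UND.
# Columns assumed: Num: Value Size Type Bind Vis Ndx Name
list_hidden_undefined() {
    awk '$6 == "HIDDEN" && $7 == "UND" { print $8 }'
}

# Example (run inside arch/x86/entry/vdso of a built tree):
#   readelf -sW vclock_gettime.o | list_hidden_undefined
```

Run against the two dumps above, this would list only vvar_vsyscall_gtod_data for the 5.1-3 LTO build, and additionally hvclock_page and pvclock_page for 5.2.8.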
@jiblime I found this on the gcc site today: https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html#index-fuse-linker-plugin-916
When a file is compiled with -flto without -fuse-linker-plugin, the generated object file is larger than a regular object file because it contains GIMPLE bytecodes and the usual final code (see -ffat-lto-objects). This means that object files with LTO information can be linked as normal object files; if -fno-lto is passed to the linker, no interprocedural optimizations are applied. Note that when -fno-fat-lto-objects is enabled the compile stage is faster but you cannot perform a regular, non-LTO link on them.
I've witnessed Andi Kleen's patchset passing -fno-fat-lto-objects without -fuse-linker-plugin. Will test this theory later today. This could explain why readelf is returning corruption, but pardon my ignorance if that's not the case.
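One way to at least confirm that an object file carries LTO bytecode is to look for GCC's `.gnu.lto_*` sections in the section headers. This is only a sketch: the function name is made up, and it tells you whether bytecode is present, not whether the object is fat or slim.

```shell
#!/bin/sh
# Hypothetical helper: read `readelf -S` output on stdin and succeed if any
# section name starts with .gnu.lto_ (where GCC stores its LTO bytecode).
has_lto_sections() {
    grep -q '\.gnu\.lto_'
}

# Example:
#   readelf -SW vclock_gettime.o | has_lto_sections && echo "LTO bytecode present"
```

Since the vdso is supposed to be built with LTO filtered out, its objects would be expected to fail this check.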
@Promaethius
I've witnessed Andi Kleen's patchset passing -fno-fat-lto-objects without -fuse-linker-plugin
That explains why he uses -fwhole-program and -fipa-cp-clone, since collect2 would be used instead of a linker. I'm assuming he's doing that for compatibility, as GCC documentation claims it's likely to increase code size vs. bfd/gold. I wonder if GentooLTO would be able to do something better...
I believe it's a glibc issue. I've upgraded to sys-libs/glibc-2.30::gentoo and have been able to get past it. Currently recompiling since one of the paravirtualization options, not sure which, causes it to error.
https://sourceware.org/ml/libc-alpha/2019-08/msg00029.html
* The dynamic linker no longer refuses to load objects which reference
versioned symbols whose implementation has moved to a different soname
since the object has been linked. The old error message, symbol
FUNCTION-NAME, version SYMBOL-VERSION not defined in file DSO-NAME with
link time reference, is gone.
It emits a warning, I'm still not sure why since Andi Kleen filters LTO out of it from what I can tell.
Warnings emitted with V=2
CC arch/x86/entry/vdso/vdso32-setup.o - due to target missing
LDS arch/x86/entry/vdso/vdso.lds - due to target missing
AS arch/x86/entry/vdso/vdso-note.o - due to target missing
CC arch/x86/entry/vdso/vclock_gettime.o - due to target missing
In file included from ./arch/x86/include/asm/vgtod.h:5,
from arch/x86/entry/vdso/vclock_gettime.c:15:
arch/x86/entry/vdso/vclock_gettime.c: In function ‘do_hres’:
./include/linux/compiler.h:182:26: warning: array subscript 1 is outside array bounds of ‘u8[1]’ {aka ‘unsigned char[1]’} [-Warray-bounds]
182 | case 8: *(__u64 *)res = *(volatile __u64 *)p; break;
| ^~~~~~~~~~~~~~~~~~~~
./include/linux/compiler.h:193:2: note: in expansion of macro ‘__READ_ONCE_SIZE’
193 | __READ_ONCE_SIZE;
| ^~~~~~~~~~~~~~~~
arch/x86/entry/vdso/vclock_gettime.c:37:11: note: while referencing ‘hvclock_page’
37 | extern u8 hvclock_page
| ^~~~~~~~~~~~
In file included from ./arch/x86/include/asm/vgtod.h:5,
from arch/x86/entry/vdso/vclock_gettime.c:15:
./include/linux/compiler.h:182:26: warning: array subscript 2 is outside array bounds of ‘u8[1]’ {aka ‘unsigned char[1]’} [-Warray-bounds]
182 | case 8: *(__u64 *)res = *(volatile __u64 *)p; break;
| ^~~~~~~~~~~~~~~~~~~~
./include/linux/compiler.h:193:2: note: in expansion of macro ‘__READ_ONCE_SIZE’
193 | __READ_ONCE_SIZE;
| ^~~~~~~~~~~~~~~~
arch/x86/entry/vdso/vclock_gettime.c:37:11: note: while referencing ‘hvclock_page’
37 | extern u8 hvclock_page
| ^~~~~~~~~~~~
CC arch/x86/entry/vdso/vgetcpu.o - due to target missing
VDSO arch/x86/entry/vdso/vdso64.so.dbg - due to target missing
OBJCOPY arch/x86/entry/vdso/vdso64.so - due to target missing
HOSTCC arch/x86/entry/vdso/vdso2c - due to target missing
VDSO2C arch/x86/entry/vdso/vdso-image-64.c - due to target missing
CC arch/x86/entry/vdso/vdso-image-64.o - due to target missing
LDS arch/x86/entry/vdso/vdso32/vdso32.lds - due to target missing
CC arch/x86/entry/vdso/vdso32/vclock_gettime.o - due to target missing
In file included from ./arch/x86/include/asm/vgtod.h:5,
from arch/x86/entry/vdso/vdso32/../vclock_gettime.c:15,
from arch/x86/entry/vdso/vdso32/vclock_gettime.c:31:
arch/x86/entry/vdso/vdso32/../vclock_gettime.c: In function ‘do_hres’:
./include/linux/compiler.h:182:26: warning: array subscript 1 is outside array bounds of ‘u8[1]’ {aka ‘unsigned char[1]’} [-Warray-bounds]
182 | case 8: *(__u64 *)res = *(volatile __u64 *)p; break;
| ^~~~~~~~~~~~~~~~~~~~
./include/linux/compiler.h:193:2: note: in expansion of macro ‘__READ_ONCE_SIZE’
193 | __READ_ONCE_SIZE;
| ^~~~~~~~~~~~~~~~
In file included from arch/x86/entry/vdso/vdso32/vclock_gettime.c:31:
arch/x86/entry/vdso/vdso32/../vclock_gettime.c:37:11: note: while referencing ‘hvclock_page’
37 | extern u8 hvclock_page
| ^~~~~~~~~~~~
In file included from ./arch/x86/include/asm/vgtod.h:5,
from arch/x86/entry/vdso/vdso32/../vclock_gettime.c:15,
from arch/x86/entry/vdso/vdso32/vclock_gettime.c:31:
./include/linux/compiler.h:182:26: warning: array subscript 2 is outside array bounds of ‘u8[1]’ {aka ‘unsigned char[1]’} [-Warray-bounds]
182 | case 8: *(__u64 *)res = *(volatile __u64 *)p; break;
| ^~~~~~~~~~~~~~~~~~~~
./include/linux/compiler.h:193:2: note: in expansion of macro ‘__READ_ONCE_SIZE’
193 | __READ_ONCE_SIZE;
| ^~~~~~~~~~~~~~~~
In file included from arch/x86/entry/vdso/vdso32/vclock_gettime.c:31:
arch/x86/entry/vdso/vdso32/../vclock_gettime.c:37:11: note: while referencing ‘hvclock_page’
37 | extern u8 hvclock_page
| ^~~~~~~~~~~~
AS arch/x86/entry/vdso/vdso32/note.o - due to target missing
AS arch/x86/entry/vdso/vdso32/system_call.o - due to target missing
AS arch/x86/entry/vdso/vdso32/sigreturn.o - due to target missing
VDSO arch/x86/entry/vdso/vdso32.so.dbg - due to target missing
OBJCOPY arch/x86/entry/vdso/vdso32.so - due to target missing
VDSO2C arch/x86/entry/vdso/vdso-image-32.c - due to target missing
CC arch/x86/entry/vdso/vdso-image-32.o - due to target missing
So as I understand, it would be a huge issue to have a textrel in a/the vdso because it'd be a vulnerability in a security feature. Gentoo's wiki actually has a guide on finding and fixing textrels: https://wiki.gentoo.org/wiki/Hardened/Textrels_Guide
But hopefully there's no need to recreate anything. While the vdso*.so files have a textrel flag marked on them, scanelf -T shows that there isn't anything that would point to it.
Glibc 2.29, GCC 9.1.0
TYPE PAX PERM ENDIAN STK/REL/PTL TEXTREL RPATH BIND TEXTRELS FILE
scanelf: scanelf_file_textrels(): ELF is missing relocation information
scanelf: scanelf_file_textrels(): ELF vdso32.so has TEXTREL markings but doesnt appear to have any real TEXTREL's !?
ET_DYN PeMRxS 0755 LE --- --- R-X TEXTREL - LAZY vdso32.so
It did also emit this, though:
arch/x86/kernel/dumpstack.o: warning: objtool: show_regs.cold()+0x16: sibling call from callable instruction with modified stack frame
arch/x86/kernel/dumpstack.o: warning: objtool: show_regs()+0x0: stack state mismatch: cfa1=7+24 cfa2=7+8
So it looks like it can be possible, but definitely experimental and not a daily driver for myself. I'm going to be grabbing GCC 9.2 now so I won't be getting to it anytime soon (btw, I added 20G of swap with -j5 and it still failed, dammit), but if Glibc 2.30 is the fix, I think it'd be worth a shot to try using this kernel for testing.
If you were to use a linker instead of collect2, you can replace -fwhole-program with -fuse-linker-plugin in scripts/Makefile.lto, as the GNU documentation states it's best not to use the former with the latter. Optimizations that would also help LTO specifically would be -fdevirtualize-at-ltrans and the -fgraphite-identity -floop-nest-optimize options. I've used these along with other flags to compile and run my kernel, but if the linking stage is too much the process will overflow and it'll end.
What's interesting is that his newest version (as far as I can tell) lacks explicit linker usage but his older versions use -fuse-linker-plugin. So I could be wrong in assuming that removing -fwhole-program is the right way to go.
Andi Kleen's lto-5.7-2 branch builds and I am currently running it. I've applied the 5.7.14 patch, Gentoo distro patches, and a few other misc. patches with no rejects.
Notes:
- nouveau does not build because the command to build it is too long for the shell
- I wasn't able to load any modules, neither when booted in nor through my initrd. Used make mod2yesconfig to convert modules to being built in. [Correction 1 below]
- ./scripts/Makefile.lto contains the settings for how LTO is done upon the kernel. I appended -flto-compression-level=9 to LTO_CFLAGS
- In the same file, TMPDIR sets the building directory in the kernel directory instead of /tmp to prevent OOM. I copied the kernel to /var/tmp/portage instead because there are massive writes on disk during linking.
- The primary demographic of LTO'd kernels seems to be embedded systems. The size of my LTO'd kernel is 22M, modules folder is 800K, vs. my normal kernel at 11M and modules folder at 71M
- It feels fast, that counts
Semi-related:
GCC 10's -O2 might be slightly slower than GCC 9's -O2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96337#c15
Inliner changes was not targetting to make compile time faster and
compiled code slower. It was intended to reflect more closely modern C++
codebases and get faster binaries (at -O2 and -O2 -flto) without
regressing in code sizes. In fact more inlining happens and thus we
needed to optimize inliner code carefully to avoid regressions with LTO.
If you have a -march=znver1/znver2 processor and run x86_64 multilib, rebuilding the current GCC 10.2.0 would mean a nice performance boost with this patch:
Refer to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95435
Correction 1: I incorrectly assumed modules weren't supported with -flto. While building everything into the kernel alleviated the issues, namely framebuffer and Logitech USB support, kernel compilation time was too long and I prefer being able to reload modules. The likely culprit in module failure was TRIM_UNUSED_KSYMS and possibly dracut defaulting to --strip the generated initrd; can't say for certain yet. I didn't get around to testing it enough but now I am able to load amdgpu in my initrd as usual instead of compiling it in.
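To check whether a suspect option such as TRIM_UNUSED_KSYMS is actually enabled in a given config, something like this sketch could be used. The function name `config_state` is made up here, and it assumes the standard `.config` format; the kernel tree's own scripts/config helper should be able to report the same thing.

```shell
#!/bin/sh
# Hypothetical helper: print the state of a kernel config option.
# Prints "y" (built in), "m" (module), or "n" (unset or not present).
# Usage: config_state .config TRIM_UNUSED_KSYMS
config_state() {
    file=$1
    opt=$2
    # Pick up only exact CONFIG_<opt>=y or CONFIG_<opt>=m lines.
    value=$(sed -n "s/^CONFIG_${opt}=\([ym]\)\$/\1/p" "$file")
    printf '%s\n' "${value:-n}"
}
```

If this prints `y` for TRIM_UNUSED_KSYMS, disabling it and rebuilding would be one way to test the theory above.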
* It feels fast, that counts
Can you describe in what way?
Cheers for the gcc links too
oooh, imma test
@jiblime Could you list the patches applied? All are from gentoo's ebuild?
@jiblime Could you list the patches applied? All are from gentoo's ebuild?
I haven't built it yet, but this patch applies fine to gentoo-sources-5.8 (just a diff from the lto-5.8.0-1 branch)
https://gist.github.com/telans/728b63dd07c41c9ca6e2ca3d4431db8e
Doesn't build for me unfortunately, lots of:
/usr/lib/gcc/x86_64-pc-linux-gnu/10.2.0/../../../../x86_64-pc-linux-gnu/bin/ld: ./.tmp_vmlinux.kallsyms1.mJXteD.ltrans123.ltrans.o: relocation R_X86_64_32S against `.data' can not be used when making a PIE object; recompile with -fPIE
also, there's no 5.7.14 patch
https://cdn.kernel.org/pub/linux/kernel/v5.x/patch-5.7.14.xz
that patch is already applied... nvm, messed up something, ended up with upstream master somehow... this will impact my SSD most def
@telans No problem. The first thing I noticed was my dmesg timestamps were lower than usual :p ideally I'll set up a phoronix benchmark to have actual data.
relocation R_X86_64_32S
Are you using ld.gold as your default linker? The Linux kernel needs either GCC/ld.bfd or Clang/ld.lld. https://github.com/InBetweenNames/gentooLTO/issues/338
sys-devel/gcc-10.2.0::gentoo was built with the following:
USE="(cxx) fortran graphite lto (multilib) nls nptl objc openmp pch pgo sanitize ssp zstd (-ada) -d -debug -doc (-fixed-point) -go (-hardened) -jit (-libssp) -objc++ -objc-gc -pie -systemtap -test -vanilla -vtv" ABI_X86="(64)"
sys-devel/binutils-2.34-r2::gentoo was built with the following:
USE="gold multitarget nls plugins static-libs -default-gold -doc -test" ABI_X86="(64)"
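A quick way to see which linker `ld` actually resolves to is to look at the first line of its version banner. The sketch below classifies that line; the function name is made up, and the banner prefixes ("GNU ld", "GNU gold", "LLD") are assumed from what these linkers typically print.

```shell
#!/bin/sh
# Hypothetical helper: classify a linker from the first line of its
# `--version` output. Prints bfd, gold, lld, or unknown.
classify_linker() {
    case $1 in
        "GNU gold"*) echo gold ;;
        "GNU ld"*)   echo bfd ;;
        LLD*)        echo lld ;;
        *)           echo unknown ;;
    esac
}

# Example:
#   classify_linker "$(ld --version | head -n 1)"
```

If this prints `gold`, that would match the default-linker situation asked about above.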
@barolo https://github.com/jiblime/linux-misc/commits/lto-5.7-prjc-r3 You can pull the patches from here or clone the single branch and build off that. The CPPC patch doesn't work for me, so I leave it off just in case it would cause me to fail to boot. It's a bit messy, I'm still not the greatest at making clean commits. I chose the 5.7-2 branch instead of 5.8 because I wanted to try the Project C scheduler (previously named BMQ, now abbreviated prjc). I'll try the 5.8 branch sometime.
I generally download a vanilla tarball from kernel.org (v5.7, v5.8, etc) and apply the Gentoo patches and incremental patches afterwards. That way I don't have to worry about rejected patches as often
@jiblime thanks for the branch, made it much easier for me. Compiling
compiled almost cleanly for me, didn't take that long either, had a bunch of "-Wstringop-overflow" warnings for the Bluetooth module. Didn't boot for me, with an error related to scsi. With modules built in it is 20M, modules dir is 1M. I have nvme and amdgpu on that box, gonna try to strip it a bit more
Narrowed it down, hidpp/logitech's stuff makes it crash, and it doesn't switch to amdgpu output. @jiblime it seems like you've had similar issues, how did you solve them? Edit. Cleaned it a bit, built amdgpu, bluetooth, and logitech hidpp as modules, the remaining issue seems to be that the framebuffer isn't being switched during boot
Are you using ld.gold as your default linker? The Linux kernel needs either GCC/ld.bfd or Clang/ld.lld.
Nope, using ld.bfd ( or at least I haven't changed it.)
sys-devel/gcc-10.2.0::gentoo was built with the following:
USE="(cxx) fortran graphite lto (multilib) nls nptl openmp pch pgo (pie) sanitize ssp vtv zstd (-ada) -d -debug -doc (-fixed-point) -go (-hardened) (-jit) (-libssp) -objc -objc++ -objc-gc -systemtap -test -vanilla" ABI_X86="(64)"
sys-devel/binutils-2.34-r2::gentoo was built with the following:
USE="gold nls plugins -default-gold -doc -multitarget -static-libs -test" ABI_X86="(64)"
Forcing LD=ld.bfd doesn't change anything either. I thought it might have been an issue with ripping a patch from the lto-5.8-1 branch, however, the branch too builds with the same relocation errors
Same issue with lto-5.7-2
Update, managed to run it and reach the desktop. The issue was with building all modules in.
So I took my working config as base, used genkernel and made sure that it runs without LTO enabled first, then enabled LTO and booted into desktop successfully.
Ended with a bunch of drivers disabled, most importantly for network and sata, luckily my main is a pcie one.
Each failed module had "disagrees about version of symbol module_layout" in dmesg, gonna investigate it now.
Edit. It seems that all of those are modules that weren't built in, so it seems that initramfs isn't working for me
Edit2. I'm typing from it, had to recompile it cleanly, cleaned it a bit and built some stuff in, module loading doesn't seem to work as I still got two of those "disagrees..." warnings
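As far as I understand it, the "disagrees about version of symbol module_layout" message generally means the symbol CRC recorded in the module doesn't match the one the kernel was built with. A rough way to compare the two is sketched below; the function name is made up, and it assumes both inputs list the CRC in the first column and the symbol name in the second, as `modprobe --dump-modversions` output and Module.symvers usually do.

```shell
#!/bin/sh
# Hypothetical helper: read "CRC<whitespace>symbol ..." lines on stdin and
# print the CRC recorded for the given symbol (first match only).
symbol_crc() {
    awk -v sym="$1" '$2 == sym { print $1; exit }'
}

# Example (paths are placeholders):
#   modprobe --dump-modversions path/to/module.ko | symbol_crc module_layout
#   symbol_crc module_layout < Module.symvers
```

If the two values differ, the module and the running kernel most likely came from different builds or configs, for example an initramfs holding stale modules.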
Can't really compare it yet, since it seems to use diff schedulers than I had with zen kernel, and spends more time at lower frequencies, would have to bench it properly to test it seriously.
I can already tell though that building that kernel is significantly faster under it
My gut tells me it has something to do with the -fPIE flag
On Sun, Aug 9, 2020, 3:27 AM Greg Shuiske [email protected] wrote:
Update, managed to run it and reach the desktop. The issue was with building all modules in. So I took my working config as base, used genkernel and made sure that it runs without LTO enabled first, then enabled LTO. Ended with a bunch of drivers disabled, most importantly for network and sata, luckily my main is a pcie one. Each module had disagrees about version of symbol module_layout in dmesg, gonna investigate it now.
@Promaethius I've solved that by changing the modules with warnings to built-in. It's running fine so far, gonna bench it with something now. My whole kernel with built-in stuff is 10 MB, with a useless 4 MB initramfs, for a gaming desktop