Built-in on GCC 11.1 leaving orphan sections
When building in the kernel tree, not as a module, on GCC 11.1.0, this warning is thrown at the end of the compilation process:
LD .tmp_vmlinux.kallsyms1
ld: warning: orphan section `.p_lkrg_read_only' from `security/lkrg/p_lkrg_main.o' being placed in section `.p_lkrg_read_only'
KSYMS .tmp_vmlinux.kallsyms1.S
The tree in which this is happening is @ 5.10.96 with the linux-hardened patches (not adding major GCC plugins or anything like that).
Hmm, something's not right here. On the subsequent 5.10.100, built as a module, and loaded at runtime, i get this fine mess:
# modprobe p_lkrg
modprobe: ERROR: could not insert 'p_lkrg': No buffer space available
and in dmesg:
[ 6474.831003] [p_lkrg] Loading LKRG...
[ 6474.831620] [p_lkrg] System does NOT support SMEP. LKRG can't enforce SMEP validation :(
[ 6474.832685] [p_lkrg] System does NOT support SMAP. LKRG can't enforce SMAP validation :(
[ 6474.836621] Freezing user space processes ... (elapsed 0.001 seconds) done.
[ 6474.837918] OOM killer disabled.
[ 6474.850588] [p_lkrg] [ED] ERROR: Can't find 'put_seccomp_filter' function :( Exiting...
[ 6474.851636] [p_lkrg] Can't initialize exploit detection features! Exiting...
[ 6474.916225] OOM killer enabled.
[ 6474.916226] Restarting tasks ... done.
# uname -r
5.10.100
I don't see anything wrong with this:
ld: warning: orphan section `.p_lkrg_read_only' from `security/lkrg/p_lkrg_main.o' being placed in section `.p_lkrg_read_only'
LKRG does create a new section called .p_lkrg_read_only, which probably matches the attributes of the kernel's .rodata section, hence the warning.
The problem with initialization is related to:
[p_lkrg] [ED] ERROR: Can't find 'put_seccomp_filter' function :( Exiting...
which is exactly the same as https://github.com/lkrg-org/lkrg/issues/135. Please read the fix for that issue in my comment: https://github.com/lkrg-org/lkrg/issues/135#issuecomment-1018693257
Hey @Adam-pi3 - thanks for jumping in. The EXPORT_SYMBOL(__put_seccomp_filter); suggestion is a neat trick, thanks, will try that.
I'm hoping it will fix both the module-based and in-binary execution woes. Fun of VMware to remove all of the handy CPU features for defensive tooling though (i know the actual hardware under that thing has SMAP/SMEP).
@sempervictus What "in-binary execution woes" are you referring to?
@solardiz: The inability for p_lkrg to run without exporting a bunch of symbols (which i'm guessing per @Adam-pi3's comment are being eaten up by GCC for those of us using -O3). I'm trimming back the exported symbols now to figure out the minimum required set in this setup and will try to push 'em up as a diff for in-tree compilation (since the DKMS method has no way to expose these from the running kernel AFAIK).
This whole thing does make me wonder if the inability of LKRG to resolve symbols at runtime transfers to other rootkits - does O3's optimization pass actually break runtime hooking by accident for things we don't want running in ring0? :smile:
I'm not a compiler genius, know some of the basics, but definitely don't speak Florian's or Brad's GCCnglish. That said, it does make me wonder whether it makes sense to have the relevant symbols resolved at compile-time when building LKRG as part of a kernel instead of having it try to find them at runtime through hooks and tracing. How bad would the overhead (effort-wise) be to do that, and how bad would it be maintenance-wise? Might also be a step toward upstream adoption (and then all of the custom kernel patches those of us griping about this use would have to adapt to LKRG vs the other way around).
I don't see anything wrong with this
@Adam-pi3 While it's just a warning, somehow upstream Linux recently decided to explicitly enable these warnings - perhaps they had their reasons:
# We never want expected sections to be placed heuristically by the
# linker. All sections should be explicitly named in the linker script.
ifdef CONFIG_LD_ORPHAN_WARN
LDFLAGS_vmlinux += --orphan-handling=warn
endif
config LD_ORPHAN_WARN
def_bool y
depends on ARCH_WANT_LD_ORPHAN_WARN
depends on !LD_IS_LLD || LLD_VERSION >= 110000
depends on $(ld-option,--orphan-handling=warn)
ARCH_WANT_LD_ORPHAN_WARN is forcibly defined for a number of archs, including x86.
Ideally, we'd follow their approach and address the warnings. Edit: this probably means that if we want to keep the section, then we need to add a linker script.
I'm not sure I recall why we even have that section - perhaps it was for page alignment (we were unsure the kernel's existing sections are page-aligned?), but then we currently don't explicitly page-align our section anyway:
//#define p_lkrg_read_only __attribute__((__section__(".data..p_lkrg_read_only"),aligned(PAGE_SIZE)))
#define __p_lkrg_read_only __attribute__((__section__(".p_lkrg_read_only")))
So maybe we should try dropping it?
@sempervictus Yes, those other issues and thoughts keep coming up, but are off-topic for this GitHub issue, which is already a mess. ;-)
@solardiz - sorry for the pollution. I exist in a bit of a chainsaw-juggling paradigm (multiple tasks at multiple consultancies and some c-type roles which often digress from tech) so i try to get whatever's in the braincase out to persistent storage before the next context switch points the neural net at a completely different problem domain. Will try to do better and open separate issues.
I'm happy to test any proposed fix while i'm on this task though - my non-grsec builds are full hardware kernels, so take a bit to compile, but i can usually get one built and packaged/tested in under an hour.
@Adam-pi3 - the EXPORT_SYMBOL trick seems to work, thank you. I ran into a few other symbols it wanted and did a fairly overkill pass at exporting symbols from seccomp, but it does load now:
[31884.704044] [p_lkrg] Loading LKRG...
[31884.709494] Freezing user space processes ... (elapsed 0.014 seconds) done.
[31884.724308] OOM killer disabled.
[31884.729610] [p_lkrg] LKRG can't enforce SELinux validation (CONFIG_GCC_PLUGIN_RANDSTRUCT detected)
[31884.946777] [p_lkrg] [kretprobe] register_kretprobe() for <ovl_create_or_link> failed! [err=-2]
[31884.946829] [p_lkrg] Can't hook 'ovl_create_or_link' function. This is expected if you are not using OverlayFS.
[31885.043235] [p_lkrg] [kretprobe] register_kretprobe() for <lookup_fast> failed! [err=-2]
[31885.043282] [p_lkrg] LKRG won't enforce pCFI validation on 'lookup_fast'
[31885.301648] [p_lkrg] LKRG initialized successfully!
[31885.301682] OOM killer enabled.
[31885.301683] Restarting tasks ... done.
Testing a reduced version of the symbol export patch shortly.
Confirm that #161 works in local testing (5.15 kvm/grsec), in our OpenStack (5.10 kvm/grsec Nova Compute nodes), in AWS on a t2.large, but still need to check that original vmware target which reported having pretty much no CPU features exposed. Since the failure to start was caused by the symbol lookup failure, i think we can close this and focus on redress in the PR.
Hey, this issue was about the linker warning that is still unfixed, and we still haven't decided on fixing it or not. That you added off-topic comments to it doesn't mean you should close it because of those now. ;-)
Sorry, i closed the issue since the symbol exports actually allowed it to load, but the linker warning persists and i should've left it open.
@solardiz I'm not sure there is anything we can do about it, at least not without providing a custom linker script: https://stackoverflow.com/questions/49095127/gnu-linker-orphan-sections-and-symbol-assignment
@Adam-pi3 Off the top of my head, our options might be:
1. Do nothing. (Accept that a warning is printed during LKRG build on recent systems.)
2. Add a linker script. (Looks like upstream Linux wants us to, if we define a new section.)
3. Silence the warning. (Can we override upstream Linux Makefile's linker option that enabled the warning in our Makefile?)
4. Don't define a new section. (Why do we, anyway? If we don't explicitly specify section alignment, then perhaps an existing section is page-aligned or not just as well, so our page-alignment of a symbol within that section will work or fail just as well.)
I think it's most reasonable to research option 4 first, as it would simplify LKRG, whereas options 2 or 3 would complicate it.
@solardiz We do have our own section, which we keep read-only and where we keep all of the configuration knobs and dynamically resolved symbols / pointers. We align it to PAGE_SIZE and place a marker page before and after our critical data. We created it specifically for security. From my perspective, options #1, #2 and #3 sound good.