
support CONFIG_RANDOMIZE_BASE=y

Open jiangenj opened this issue 1 year ago • 8 comments

The current implementation doesn't work well when CONFIG_RANDOMIZE_BASE is enabled.

Take some arm64 devices as an example: kaslr_offset varies in bits 12-40, and kernel modules are loaded within a 2GB region. So we have ffffffd342e10000 T _stext, where the upper 32 bits are ffffffd3. However, if we check the module range, the first module is loaded at 0xffffffd2eeb2a000 and the last module at 0xffffffd2f42c4000. The upper 32 bits therefore differ between the core kernel and the modules.

If we keep using 32-bit covered PCs, module addresses are recovered incorrectly, because recovery combines pcBase from the core kernel with the lower 32-bit offset.

So we need to move to 64-bit cover and signal.
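The mismatch can be reproduced with the two addresses quoted above. This is a minimal sketch, not syzkaller's actual code: `restorePC32` is a hypothetical helper that mimics "upper 32 bits of pcBase plus lower 32 bits of the PC".

```go
package main

import "fmt"

// restorePC32 is a hypothetical helper: it rebuilds a 64-bit PC from a
// truncated 32-bit value by borrowing the upper 32 bits of pcBase.
func restorePC32(pcBase uint64, low uint32) uint64 {
	return pcBase&0xffffffff00000000 | uint64(low)
}

func main() {
	stext := uint64(0xffffffd342e10000)    // _stext of the core kernel (from the description)
	modulePC := uint64(0xffffffd2eeb2a000) // load address of the first module (from the description)

	// Truncate the module PC to 32 bits, then try to restore it.
	restored := restorePC32(stext, uint32(modulePC))

	fmt.Printf("actual:   %#x\n", modulePC)
	fmt.Printf("restored: %#x\n", restored)
	fmt.Println("match:", restored == modulePC)
}
```

The restored value ends up with the upper half ffffffd3 instead of ffffffd2, so it points outside the module.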

Besides, the PR contains some other fixes and improvements to align with 64-bit support:

  • Fix the wrong module size: it should be the .text size rather than the size from /proc/modules.
  • Calculate kaslr_offset by subtracting the _stext address in the ELF file from the _stext address on the device.
  • Unify Module and KernelModule into one type that contains Name, Addr, Path, Size.
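The kaslr_offset calculation from the list above could look roughly like the sketch below. The helper names `findSymbol` and `kaslrOffset` are made up for illustration, and the link-time `_stext` value is a hypothetical placeholder; only the runtime `_stext` comes from the description.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// findSymbol scans kallsyms-style text ("<hex addr> <type> <name> [module]")
// and returns the address of the named symbol.
func findSymbol(kallsyms, name string) (uint64, error) {
	s := bufio.NewScanner(strings.NewReader(kallsyms))
	for s.Scan() {
		f := strings.Fields(s.Text())
		if len(f) >= 3 && f[2] == name {
			return strconv.ParseUint(f[0], 16, 64)
		}
	}
	return 0, fmt.Errorf("symbol %q not found", name)
}

// kaslrOffset is the runtime address minus the link-time address of the same symbol.
func kaslrOffset(runtimeAddr, elfAddr uint64) uint64 {
	return runtimeAddr - elfAddr
}

func main() {
	// On a device this text would come from /proc/kallsyms.
	kallsyms := "ffffffd342e10000 T _stext\n"
	// Hypothetical link-time _stext, as it might be read from the vmlinux symbol table.
	elfStext := uint64(0xffffffc008010000)

	runtimeStext, err := findSymbol(kallsyms, "_stext")
	if err != nil {
		panic(err)
	}
	fmt.Printf("kaslr_offset = %#x\n", kaslrOffset(runtimeStext, elfStext))
}
```

Note that /proc/kallsyms reports zeroed addresses to unprivileged readers when kptr_restrict is in effect, so the runtime value has to be read with sufficient privileges.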

jiangenj avatar May 22 '24 01:05 jiangenj

Codecov Report

Attention: Patch coverage is 14.93213% with 188 lines in your changes missing coverage. Please review.

Project coverage is 61.0%. Comparing base (757f06b) to head (2c57f85).

:exclamation: Current head 2c57f85 differs from pull request most recent head 7c70de3

Please upload reports for the commit 7c70de3 to get more accurate results.

Additional details and impacted files
| Files | Coverage Δ |
|---|---|
| pkg/cover/canonicalizer.go | 93.7% <100.0%> (-3.9%) |
| syz-manager/covfilter.go | 23.6% <ø> (+23.6%) |
| pkg/cover/backend/gvisor.go | 18.0% <0.0%> (ø) |
| pkg/vminfo/netbsd.go | 50.0% <0.0%> (ø) |
| pkg/vminfo/openbsd.go | 42.9% <0.0%> (ø) |
| pkg/vminfo/vminfo.go | 74.4% <50.0%> (ø) |
| syz-manager/cover.go | 0.0% <0.0%> (ø) |
| pkg/cover/report.go | 81.8% <50.0%> (-1.0%) |
| pkg/cover/backend/mach-o.go | 0.0% <0.0%> (ø) |
| pkg/cover/backend/backend.go | 0.0% <0.0%> (ø) |
| ... and 5 more | |

... and 54 files with indirect coverage changes

codecov[bot] avatar May 22 '24 06:05 codecov[bot]

CC @eprucka3 @kalder since this touches Android modules support.

dvyukov avatar May 31 '24 08:05 dvyukov

CONFIG_RANDOMIZE_BASE does not work for kernel modules since we subtract vmlinux's kaslr_offset() from modules as well, right?

> • Always use KernelModule instead of ptr.

Why do we need this? This will increase the size of all the objects that embed KernelModule by value.

dvyukov avatar May 31 '24 08:05 dvyukov

> CONFIG_RANDOMIZE_BASE does not work for kernel modules since we subtract vmlinux's kaslr_offset() from modules as well, right?

There was a recent change to getModuleTextAddr, so the module offsets are now pulled directly from /sys/module, and they already include the offset. That change also removed the kaslr offset from discoverModulesLinux.
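For context, reading a module's .text address from sysfs can be sketched as follows. This is an illustration, not the getModuleTextAddr implementation; `parseAddr`, `moduleTextAddr`, the `sysfsRoot` parameter and the module name `loop` are assumptions. The `/sys/module/<name>/sections/.text` file itself is a standard sysfs entry and normally requires root to show a real address.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strconv"
	"strings"
)

// parseAddr parses a hex address such as "0xffffffd2eeb2a000\n".
func parseAddr(s string) (uint64, error) {
	s = strings.TrimSpace(s)
	s = strings.TrimPrefix(s, "0x")
	return strconv.ParseUint(s, 16, 64)
}

// moduleTextAddr reads <sysfsRoot>/module/<name>/sections/.text.
// The value is the runtime load address, so it already includes any
// randomization applied by the kernel.
func moduleTextAddr(sysfsRoot, name string) (uint64, error) {
	data, err := os.ReadFile(filepath.Join(sysfsRoot, "module", name, "sections", ".text"))
	if err != nil {
		return 0, err
	}
	return parseAddr(string(data))
}

func main() {
	// "loop" is just an example module name; it may not be loaded on your system.
	addr, err := moduleTextAddr("/sys", "loop")
	if err != nil {
		fmt.Println("cannot read module address:", err)
		return
	}
	fmt.Printf(".text of loop is at %#x\n", addr)
}
```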

eprucka3 avatar May 31 '24 15:05 eprucka3

> CONFIG_RANDOMIZE_BASE does not work for kernel modules since we subtract vmlinux's kaslr_offset() from modules as well, right?
>
> • Always use KernelModule instead of ptr.
>
> Why do we need this? This will increase size of all these objects that embed KernelModule by value.

Fixed to use ptr only.

jiangenj avatar Jun 04 '24 02:06 jiangenj

> CONFIG_RANDOMIZE_BASE does not work for kernel modules since we subtract vmlinux's kaslr_offset() from modules as well, right?
>
> There was a recent change to getModuleTextAddr, so the modules offsets are pulled directly from /sys/module which already includes the offset. This change included removing the kaslr offset in discoverModulesLinux.

Right, there are two kinds of offset for modules:

  1. The offset from a module being loaded at a randomized address after each reboot, which canonicalizer.go already supports.
  2. The kaslr offset introduced by CONFIG_RANDOMIZE_BASE, which is present in the module addresses in /proc/modules. This PR adds support for removing kaslr_offset from module addresses.

jiangenj avatar Jun 04 '24 02:06 jiangenj

> CONFIG_RANDOMIZE_BASE does not work for kernel modules since we subtract vmlinux's kaslr_offset() from modules as well, right?
>
> There was a recent change to getModuleTextAddr, so the modules offsets are pulled directly from /sys/module which already includes the offset. This change included removing the kaslr offset in discoverModulesLinux.
>
> Right, there are two kinds of offset for modules:
>
>   1. for module loaded into randomized address after each reboot, which was already supported in canonicalizer.go.
>   2. kaslr offset introduced by CONFIG_RANDOMIZE_BASE, which is presented in modules address in /proc/modules. The PR is to support removing the kaslr_offset from module address.

Do we remove the kaslr offset from module addresses twice now? How does it manifest? Do we have code coverage reports for modules? Do they work?

Overall it's hard to move forward with your PRs because they mix lots of different fixes, large refactorings, and performance optimizations, and individual commits don't have any explanation.

For example this one: https://github.com/google/syzkaller/pull/4828/commits/3b985ce5ea82d148868b804e4798ed7c9da899e9 If you send it in a separate PR, we can merge it right away and get it off the plate.

This one: https://github.com/google/syzkaller/pull/4828/commits/64f620866fcb76e44163a6c58c6021251b3c6534 It would be good to understand what is calling it repeatedly. Perhaps something should be fixed on the caller side. Currently it's wrong: if I call the function a second time with different dirs, it will return the old result; also, the error is not cached. Generally, such caching at the lowest level just tries to work around architectural problems at higher levels. But in this large PR all of this gets lost among the other changes.

This commit: https://github.com/google/syzkaller/pull/4828/commits/fd2a12716901848b7d4ed8f524bceecbf3b70f26 It seems to be fixing something, but I can't understand what exactly. If you send this fix as a separate PR with its own explanation of the problem, and ideally a test (so that we don't break it tomorrow), then we could also merge it separately.

The KernelModule pointer refactoring also seems to be unrelated to the rest. I can't understand the reasons behind it, and it just makes progress on the rest of the PR more difficult.

dvyukov avatar Jun 10 '24 08:06 dvyukov

Let me see if I can split it into several smaller PRs.

jiangenj avatar Jun 11 '24 02:06 jiangenj

@dvyukov updated, pls check again.

jiangenj avatar Jul 01 '24 05:07 jiangenj

All gvisor instances started failing with:

failed to create rpc server: no symbol section

dvyukov avatar Jul 03 '24 09:07 dvyukov

Sent https://github.com/google/syzkaller/pull/4974

dvyukov avatar Jul 03 '24 09:07 dvyukov