kedr Patch KEDR for linux <= 5.11.0

Hey there,

I'm the same guy from #49! We're still interested in using KEDR for our operating systems class (specifically, for Debian 11).

This is only a draft PR though, enough to get KEDR working again for simple use cases. I'd be happy to iterate on this so we can get the test suite and the docs up to date, too.

Some general notes:

Resolved #50 by making __kerealloc optional. To my understanding, KEDR doesn't actually use __kerealloc, but does in theory support it being interecepted?
kallsyms_lookup_name is no longer exported in Linux, following: https://lwn.net/Articles/813350/. Here is a workaround that I found that doesn't require recompiling: https://github.com/xcellerator/linux_kernel_hacking/issues/3 (at least for Debian 11)
Fixed a few minor function signature changes
Here is an issue we'll have to resolve eventually. sources/core/kedr_target_detector.c uses module_mutex and find_module() to see if a given module is already running. Unfortunately, they are made static and unexported respectively in upcoming versions of linux. It's fine for <= ~5.10.57~ 5.11.0 though.

This PR also breaks backwards compatibility with earlier versions of linux. If this is something you care about, we can definitely macro-protect some of these changes. Let me know what you think @euspectre !

Aug 27 '21 00:08 hmontero1205

First of all, thanks for looking into this.

Could you please split the patch into a few parts, it would be easier to review:

Workaround for unexported kallsyms_lookup_name. It would be cleaner if we put it into a separate function rather than into prepare_set_memory_rx_funcs() directly. It is an interesting idea to use kprobes to get its address, it should work for older kernels too. Error handling for register_kprobe is needed though. I guess, upstream kernel developers are likely to unexport register_kprobe in the future too (if this hack becomes widespread), but workarounds are possible. E.g. we could pass the address of kallsyms_lookup_name to KEDR core from the user space if that happens.
Making __krealloc optional. IIRC, "OPTIONAL" keyword marks all the functions following it as optional, not only __krealloc, so it is better to move __krealloc after some existing OPTIONAL keyword.

On the other hand, we could probably mark all functions optional, I think, and just take what the kernel has. The distinction OPTIONAL VS REQUIRED has been artificial for quite some time, IMHO.

"minor function signature changes" It is about __vmalloc and pgprot_t prot parameter, right? I guess, this is the part that breaks backward compatibility. We could provide __vmalloc.data files both for the old and the new variants and detect the signatures at the build stage of KEDR to select the correct file, see https://github.com/euspectre/kedr/commit/9f8efbe92a56f7621fa0cba5b5ff73b6586863d3, for example.

KEDR is still used for older kernels like 3.10.0-x from RHEL7 and such, so backward compatibility is important.

Other changes, like replacing pr_warning with pr_warn, etc..

Here is an issue we'll have to resolve eventually. sources/core/kedr_target_detector.c uses module_mutex and find_module() to see if a given module is already running. Unfortunately, they are made static and unexported respectively in upcoming versions of linux. It's fine for <= 5.10.57 5.11.0 though.

If we have a working kallsyms_lookup_name function and the kernel is built with CONFIG_KALLSYMS_ALL=y we could just use it to find the addresses of module_mutex and find_module(). Again, probably with checking of the kernel version or availablility of these symbols at KEDR's build stage.

Aug 27 '21 16:08 euspectre

If we have a working kallsyms_lookup_name function and the kernel is built with CONFIG_KALLSYMS_ALL=y we could just use it to find the addresses of module_mutex and find_module(). Again, probably with checking of the kernel version or availablility of these symbols at KEDR's build stage.

@euspectre Is this something you think we should tackle now or later? I'm pretty sure current releases of major linux distributions don't have kernels new enough where that matters.

Aug 27 '21 17:08 hmontero1205

Is this something you think we should tackle now or later? I'm pretty sure current releases of major linux distributions don't have kernels new enough where that matters.

Later, in a separate patch. For now, supporting kernels up to 5.11 should be enough.

Aug 27 '21 17:08 euspectre

@euspectre please see 1da05d9 and let me know if there is a more idiomatic way to handle function renaming across linux versions

Aug 27 '21 19:08 hmontero1205

@ownia Can you please pull down and test again? I learned that I was using OPTIONAL incorrectly which hid other functions which may have been problematic (like kzfree).

Aug 27 '21 19:08 hmontero1205

please see 1da05d9 and let me know if there is a more idiomatic way to handle function renaming across linux versions

Looks good to me. The code is more readable this way.

Aug 27 '21 20:08 euspectre

@hmontero1205 I am curious about how you use KEDR in your OS class. I mean, which its features are you using or plan to use; what could be improved (in addition to support for new kernels, of course). This is not strictly related to this PR, so, if you prefer, you could write to me (euspectre (at) gmail (dot) com) directly. I have probably asked you about that earlier but the feedback is always appreciated.

I have been thinking about rewriting KEDR to make the instrumentation system more portable and to get rid of code generation (the fact that KEDR generates function handlers during the build makes it impossible to submit it to mainline, among other things).

However, I am not sure what people need from KEDR, why they choose it rather than the in-kernel tracing and debugging tools. That is, which features should be implemented in the new KEDR in the first place?

For example, how important is that KEDR does not require rebuilding the kernel and rebooting the system?

KEDR cannot analyze the code from vmlinux at the moment, only the code from modules. Would it be helpful for your use cases if KEDR could attach to any part of the kernel, both from vmlinux and from the modules?

If you use the memory leak detector from KEDR, then kmemleak from the debug-enabled kernels probably does not suit your needs. Why?

Same question goes for fault simulation if you use it.

Thanks in advance.

Aug 27 '21 20:08 euspectre

Our OS class is an introduction to various parts of an operating system (syscalls, synchronization, scheduling, memory, filesystem) through the perspective of linux. Our assignments involve hacking the linux kernel to implement something new (new syscall, new in-kernel data structure, new scheduler, in-software page table traversal tool, new filesystem).

Since our students more often than not do not have access to wickedly powerful machines, kernel compilation times tend to be painful and so the majority of our assignments opt for kernel modules for some version we choose in the beginning of the semester. This way, the number of times students have to compile the kernel from scratch is significantly reduced.

For our first assignment, we introduce kernel modules before we introduce kernel compilation, which is why we've used KEDR for years now. We don't want to overwhelm students with kernel compilation in the first week of the semester.

The rest of the assignments do involve kernel compilation, so there is actually no logistical reason for us not to use kmemleak. We just choose not to because KEDR is simpler to use and is more than enough for our purposes: checking if a given assignment's kernel module doesn't leak memory. We think kmemleak is overkill for this simple use case.

And so, I think the appeal of KEDR (for our students, who have little experience with kernel development) is that it's incredibly easy to set up for basic use cases and it does not require kernel compilation. I'm not sure if your intended audience should be us, though. I'm sure there are benefits from requiring kernel recompilation/analyzing vmlinux, but I doubt we'd ever need them. Perhaps other KEDR users would though :^)

Aug 27 '21 20:08 hmontero1205

@hmontero1205 I see. Thanks!

Aug 28 '21 05:08 euspectre

@ownia Can you please pull down and test again? I learned that I was using OPTIONAL incorrectly which hid other functions which may have been problematic (like kzfree).

Ok, tested and passed.

Aug 30 '21 01:08 ownia