bcc
How to use bpf_override_return() on functions NOT in kernel whitelist
We need to be able to allow/disallow tcp_v4_connect() under certain conditions (say, for a specific dest port/ip). Since this function is not a "whitelisted" function, we cannot use bpf_override_return() on it. Can someone please clarify why this is the case and how we can accomplish this?
Yes, this is an opt-in feature. You need to add the function to err-injection framework. Adding a new function to error injection framework is not complicated. See the example patch https://patchwork.kernel.org/patch/10730673/
Note that this is kprobe based. What you have is function parameters. In your case, you should be able to get it from sock and sockaddr, I guess.
BTW, bcc has a tool, inject.py, for error injection. As more functions are opted in, this tool will need changes (mostly a few tables) to stay up to date. Please consider contributing if your new tcp_v4_connect() opt-in is accepted.
Thank you @yonghong-song for the response.
As I'm new to iovisor-bcc, I need to clarify a couple things.
- To enforce a filter using bpf_override_return() on a particular syscall, say, tcp_v4_connect(), since it is not in "kernel white list" we have to patch the kernel?
- We have to rely on the existing kernel version (eg. 4.16.0-041600-generic on Ubuntu 16.04.6) on the distribution used at a customer site. Is there any other way to do this via eBPF without going through bpf_override_return()?
Answer 1: Yes, you need to patch the kernel. Answer 2: I am not aware of other mechanisms. The tcpbpf infrastructure has a callback in tcp_connect() function. But it does not allow you to change connect functionality.
@gitdhar @yonghong-song Since PARM1 and PARM2 of int tcp_v4_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len) are passed by reference, could we check for a specific dest port/ip and then modify them with bpf_probe_write_user to let tcp_v4_connect fail "naturally"?
I am not sure how you could do that. The sk and uaddr actually are kernel addresses. They are not marked with __user. Checking the kernel code, the information is actually copied from user to kernel.
The bpf_probe_write_user() writes to user space of the "current" task. If you can find the user address through "current" or "uaddr", you may be able to do that. But bpf_probe_write_user is dangerous, may crash the user application, and is recommended for experimental use only.
I am not sure how you could do that. The sk and uaddr actually are kernel addresses. They are not marked with __user. Checking the kernel code, the information is actually copied from user to kernel.
Sorry, I gave the wrong function. I think it should be int __sys_connect(int fd, struct sockaddr __user *uservaddr, int addrlen). (But I have not tried this yet. I have only used this approach to inject into gettimeofday, which also requires setting the clock source to hpet so that the vDSO falls back to the syscall.)
The bpf_probe_write_user() writes to user space of the "current" task. If you can find the user address through "current" or "uaddr", you may be able to do that. But bpf_probe_write_user is dangerous, may crash the user application, and is recommended for experimental use only.
Thank you for your explanation. I am hoping for secure fault injection capabilities through BPF without modifying the kernel, as most of our customers don't accept a custom kernel or kernel module. I heard that sleepable and preemptible BPF programs may appear in the future. Will they be able to handle userspace page faults and be secure enough for fault injection? Thank you.
Yes, your example above should work.
Regarding sleepable and preemptible BPF programs: yes, they may appear in the future. Currently, there is an effort to make BPF work better with the RT (RealTime) kernel, in which case BPF programs may need to be preemptible. There is no concrete design yet.
How can I tell whether a kernel function is whitelisted or not?
Try the following command,
cat /proc/kallsyms | grep _eil_addr
The list will contain all error-injectable functions: most syscalls, plus some other functions as well.
ffffffff83a87fe0 d _eil_addr___ia32_sys_quotactl
ffffffff83a87ff0 d _eil_addr_btrfs_cow_block
ffffffff83a88000 d _eil_addr_btrfs_search_slot
ffffffff83a88010 d _eil_addr_open_ctree
ffffffff83a88020 d _eil_addr_io_ctl_init
ffffffff83a88030 d _eil_addr_btrfs_should_cancel_balance
ffffffff83a88040 d _eil_addr_btrfs_check_leaf_full
ffffffff83a88050 d _eil_addr_btrfs_check_node
ffffffff83a88060 d _eil_addr___x64_sys_msgget
ffffffff83a88070 d _eil_addr___ia32_sys_msgget
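The same lookup can be scripted. The following is a minimal sketch, assuming the `_eil_addr_<function>` symbol naming shown in the output above; the helper names `injectable_functions` and `is_injectable` are made up for illustration and are not part of bcc.

```python
import re

# Matches kallsyms lines such as:
#   ffffffff83a87ff0 d _eil_addr_btrfs_cow_block
# and captures the function name that follows the "_eil_addr_" prefix.
_EIL_RE = re.compile(r"^[0-9a-fA-F]+\s+\S\s+_eil_addr_(\S+)")


def injectable_functions(kallsyms_text):
    """Return the set of function names that have an _eil_addr_ entry."""
    names = set()
    for line in kallsyms_text.splitlines():
        match = _EIL_RE.match(line.strip())
        if match:
            names.add(match.group(1))
    return names


def is_injectable(func, path="/proc/kallsyms"):
    """Check the running kernel (hypothetical helper; reads `path`)."""
    with open(path) as f:
        return func in injectable_functions(f.read())
```

For example, `is_injectable("tcp_v4_connect")` would be expected to return False on a stock kernel, which is the situation described at the top of this thread.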
The config CONFIG_FUNCTION_ERROR_INJECTION is necessary.
You can check the file linux/include/asm-generic/error-injection.h
#define ALLOW_ERROR_INJECTION(fname, _etype) \
    static struct error_injection_entry __used \
        __section("_error_injection_whitelist") \
        _eil_addr_##fname = { \
            .addr = (unsigned long)fname, \
            .etype = EI_ETYPE_##_etype, \
        }
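Since the list reportedly includes most syscalls, one way to approach the original question without patching the kernel might be to override the connect syscall entry point instead of tcp_v4_connect(). The sketch below is untested and rests on several assumptions: that the connect syscall wrapper shows up in the `_eil_addr` list on your kernel, that CONFIG_BPF_KPROBE_OVERRIDE is enabled, and that bpf_probe_read_user() is available (older kernels only have bpf_probe_read()). The names `build_program` and `attach_and_wait`, and the placeholder tokens, are hypothetical. It filters on IPv4 destination port only.

```python
import errno
import socket
import time

BPF_TEMPLATE = r"""
#include <uapi/linux/ptrace.h>
#include <linux/socket.h>
#include <linux/in.h>

// bcc maps the arguments of functions named syscall__* onto the
// syscall's own parameters.
int syscall__connect(struct pt_regs *ctx, int fd,
                     struct sockaddr *uservaddr, int addrlen)
{
    struct sockaddr_in sa = {};

    if (addrlen < (int)sizeof(sa))
        return 0;
    // At syscall entry uservaddr is still a user-space pointer.
    if (bpf_probe_read_user(&sa, sizeof(sa), uservaddr) != 0)
        return 0;
    // __DPORT_BE__ is the port already converted to network byte order.
    if (sa.sin_family != AF_INET || sa.sin_port != __DPORT_BE__)
        return 0;

    // Skip the syscall body and make connect() fail with this errno.
    bpf_override_return(ctx, -__ERRNO__);
    return 0;
}
"""


def build_program(port, err=errno.ECONNREFUSED):
    """Fill in the port (network byte order) and errno placeholders."""
    if not 0 < port < 65536:
        raise ValueError("port out of range: %r" % (port,))
    text = BPF_TEMPLATE.replace("__DPORT_BE__", str(socket.htons(port)))
    return text.replace("__ERRNO__", str(err))


def attach_and_wait(port):
    """Attach the probe; needs root, bcc, and a kernel that allows it."""
    from bcc import BPF  # imported lazily so build_program() works without bcc
    b = BPF(text=build_program(port))
    b.attach_kprobe(event=b.get_syscall_fnname("connect"),
                    fn_name="syscall__connect")
    try:
        while True:
            time.sleep(1)
    except KeyboardInterrupt:
        pass
```

Calling `attach_and_wait(8080)` as root would, under the assumptions above, make IPv4 connect() calls to port 8080 return ECONNREFUSED until interrupted. Note that this affects every process on the machine, so try it on a test system first.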
root@localhost:/bcc# cat /proc/kallsyms | grep _eil_addr
......
ffffffe67516ae60 d _eil_addr___arm64_sys_llseek
ffffffe67516ae70 d _eil_addr_vfs_read
ffffffe67516ae80 d _eil_addr___arm64_sys_read
ffffffe67516ae90 d _eil_addr___arm64_sys_write
......
I added vfs_read to the whitelist, but I still have a problem with bpf_override_return().
root@localhost:/bcc/examples# cat hello_world.py
from bcc import BPF
BPF(text='int kprobe__vfs_read(void *ctx) { bpf_trace_printk("Hello, World!\\n"); bpf_override_return(ctx, 0); return 0; }').trace_print()
./hello_world.py
ioctl(PERF_EVENT_IOC_SET_BPF): Invalid argument
Traceback (most recent call last):
File "/bcc/examples/./hello_world.py", line 12, in
Do you have CONFIG_BPF_KPROBE_OVERRIDE in your kernel config?
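A quick way to check both options mentioned in this thread is to parse the kernel config. This is a minimal sketch, assuming the config is exposed at /proc/config.gz or /boot/config-<release> (not every distribution or Android build ships either); `missing_options` and `read_kernel_config` are illustrative names.

```python
import gzip
import os
import platform

# The two options this thread says bpf_override_return() depends on.
REQUIRED = ("CONFIG_BPF_KPROBE_OVERRIDE", "CONFIG_FUNCTION_ERROR_INJECTION")


def missing_options(config_text, required=REQUIRED):
    """Return the required options that are not set to y (or m)."""
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        # Disabled options appear as "# CONFIG_FOO is not set".
        if line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        if value in ("y", "m"):
            enabled.add(name)
    return [opt for opt in required if opt not in enabled]


def read_kernel_config():
    """Read the running kernel's config from the usual locations."""
    if os.path.exists("/proc/config.gz"):
        with gzip.open("/proc/config.gz", "rt") as f:
            return f.read()
    with open("/boot/config-" + platform.release()) as f:
        return f.read()
```

`missing_options(read_kernel_config())` returning an empty list only shows that the options are compiled in; the probed function still has to be on the error-injection list.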
@yonghong-song Yes, I have.
This problem has bothered me for a long time. Can you help me?
I am running on the Android platform using adeb. Other examples work fine; only bpf_override_return fails...
add bpf_override_return API
Br
People in the kernel community are pretty concerned about bpf_override_return(), as it might change kernel behavior and crash the kernel. That is why bpf_override_return() is supported only in a limited number of places.
If you want bpf_override_return() for a particular function, feel free to submit a kernel patch, or mention it here and somebody might help craft a patch.