How to use bpf_override_return() on functions NOT in kernel whitelist

Open gitdhar opened this issue 4 years ago • 10 comments

We need to be able to allow or disallow tcp_v4_connect() under certain conditions (say, for a specific dest port/IP). Since this function is not a "whitelisted" function, we cannot use bpf_override_return() on it. Can someone please clarify why this is the case and how we can accomplish this?

gitdhar, Aug 06 '19 18:08

Yes, this is an opt-in feature. You need to add the function to the error injection framework, which is not complicated. See the example patch https://patchwork.kernel.org/patch/10730673/

Note that this is kprobe-based, so what you have access to is the function parameters. In your case, you should be able to get the destination port/IP from the sock and sockaddr arguments, I guess.
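As a rough illustration of reading the destination from the function parameters, here is a minimal bcc sketch. It assumes a kernel that has already been patched so that tcp_v4_connect() is on the error injection whitelist (with ERRNO as the error type) and that has CONFIG_BPF_KPROBE_OVERRIDE enabled; on an unpatched kernel the attach step is expected to fail. The names PROG_TEMPLATE, build_program and run, and the choice of -ECONNREFUSED, are illustrative assumptions and do not come from the thread.

```python
import time

# BPF C source. BLOCKED_PORT is a placeholder substituted by build_program().
PROG_TEMPLATE = r"""
#include <uapi/linux/ptrace.h>
#include <net/sock.h>
#include <linux/in.h>

// bcc attaches this to tcp_v4_connect() because of the kprobe__ prefix.
int kprobe__tcp_v4_connect(struct pt_regs *ctx, struct sock *sk,
                           struct sockaddr *uaddr)
{
    struct sockaddr_in sin = {};

    // uaddr is a kernel address at this point, so use a kernel probe read.
    // (On older kernels, bpf_probe_read() is the available helper.)
    bpf_probe_read_kernel(&sin, sizeof(sin), uaddr);

    // Fail the connect only for IPv4 connections to the blocked port.
    if (sin.sin_family == AF_INET && ntohs(sin.sin_port) == BLOCKED_PORT)
        bpf_override_return(ctx, -ECONNREFUSED);
    return 0;
}
"""


def build_program(port):
    """Return the BPF C text with the blocked destination port filled in."""
    port = int(port)
    if not 0 < port < 65536:
        raise ValueError("port must be in 1..65535")
    return PROG_TEMPLATE.replace("BLOCKED_PORT", str(port))


def run(port):
    """Load and attach the program, then wait until interrupted (needs root)."""
    from bcc import BPF  # imported lazily so build_program() works without bcc

    bpf = BPF(text=build_program(port))  # keeps the kprobe attached while alive
    try:
        while True:
            time.sleep(1)
    except KeyboardInterrupt:
        pass
    return bpf
```

Calling run(8080) as root on a suitably patched kernel would be the way to try it; build_program() on its own only produces the program text.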

yonghong-song, Aug 06 '19 21:08

BTW, bcc has a tool, inject.py, for error injection. As more functions are opted in, this tool will need changes (mostly to a few tables) to stay up to date. Please consider contributing if your new tcp_v4_connect() opt-in is accepted.

yonghong-song, Aug 06 '19 21:08

Thank you @yonghong-song for the response.

As I'm new to iovisor-bcc, I need to clarify a couple of things.

  1. To enforce a filter using bpf_override_return() on a particular kernel function, say tcp_v4_connect(), do we have to patch the kernel because the function is not in the "kernel whitelist"?
  2. We have to rely on the existing kernel version (e.g. 4.16.0-041600-generic on Ubuntu 16.04.6) of the distribution used at a customer site. Is there any other way to do this via eBPF without going through bpf_override_return()?

gitdhar, Aug 08 '19 17:08

Answer 1: Yes, you need to patch the kernel.

Answer 2: I am not aware of other mechanisms. The tcpbpf infrastructure has a callback in the tcp_connect() function, but it does not allow you to change the connect behavior.

yonghong-song, Aug 08 '19 20:08

@gitdhar @yonghong-song Since PARM1 and PARM2 of int tcp_v4_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len) are passed by reference, could we check for a specific dest port/IP and then modify them with bpf_probe_write_user() so that tcp_v4_connect() fails "naturally"?

ethercflow, Jan 20 '20 03:01

I am not sure how you could do that. The sk and uaddr arguments are actually kernel addresses; they are not marked with __user. Checking the kernel code, the information has already been copied from user space to the kernel.

bpf_probe_write_user() writes to the user space of the "current" task. If you can find the user address through "current" or "uaddr", you may be able to do that. But bpf_probe_write_user() is dangerous, may crash the user application, and is recommended for experimental use only.

yonghong-song, Jan 20 '20 04:01

> I am not sure how you could do that. The sk and uaddr arguments are actually kernel addresses; they are not marked with __user. Checking the kernel code, the information has already been copied from user space to the kernel.

Sorry, I gave the wrong function. I think it should be int __sys_connect(int fd, struct sockaddr __user *uservaddr, int addrlen). (I have not tried this yet. I have only used this approach to inject faults into gettimeofday, which also requires setting the clock source to hpet so that the vDSO falls back to the syscall.)

> bpf_probe_write_user() writes to the user space of the "current" task. If you can find the user address through "current" or "uaddr", you may be able to do that. But bpf_probe_write_user() is dangerous, may crash the user application, and is recommended for experimental use only.

Thank you for your explanation. I am hoping for secure fault injection capabilities through BPF without modifying the kernel, as most of our customers don't accept a custom kernel or kernel module. I heard that sleepable and preemptible BPF programs may appear in the future. Will they be able to handle userspace page faults, and be secure enough for fault injection? Thank you.

ethercflow, Jan 20 '20 08:01

Yes, your example above should work.

Regarding sleepable and preemptible BPF programs: yes, they may appear in the future. Currently there is an effort to make BPF work better with the RT (real-time) kernel, in which case BPF programs may need to be preemptible. There is no concrete design yet.

yonghong-song, Jan 21 '20 04:01

How can I tell whether a kernel function is whitelisted?

MetaT1an, Jul 03 '22 06:07

Try the following command:

cat /proc/kallsyms | grep _eil_addr

The list will contain all functions that allow error injection. Most of them are syscalls, but there are some other functions as well.

ffffffff83a87fe0 d _eil_addr___ia32_sys_quotactl                                                   
ffffffff83a87ff0 d _eil_addr_btrfs_cow_block                                                       
ffffffff83a88000 d _eil_addr_btrfs_search_slot                                                     
ffffffff83a88010 d _eil_addr_open_ctree                                                            
ffffffff83a88020 d _eil_addr_io_ctl_init                                                           
ffffffff83a88030 d _eil_addr_btrfs_should_cancel_balance                                           
ffffffff83a88040 d _eil_addr_btrfs_check_leaf_full                                                 
ffffffff83a88050 d _eil_addr_btrfs_check_node                                                      
ffffffff83a88060 d _eil_addr___x64_sys_msgget                                                      
ffffffff83a88070 d _eil_addr___ia32_sys_msgget    

The config option CONFIG_FUNCTION_ERROR_INJECTION is required. You can check the file linux/include/asm-generic/error-injection.h:

#define ALLOW_ERROR_INJECTION(fname, _etype)                            \
static struct error_injection_entry __used                              \
        __section("_error_injection_whitelist")                         \
        _eil_addr_##fname = {                                           \
                .addr = (unsigned long)fname,                           \
                .etype = EI_ETYPE_##_etype,                             \
        }
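The same check can be scripted. Below is a minimal sketch that looks for the _eil_addr_<function> symbols that the macro above generates; the helper names (injectable_functions, is_injectable, read_kallsyms) are made up for illustration, and reading /proc/kallsyms may require root for the addresses, although only the symbol names matter here.

```python
import re

# ALLOW_ERROR_INJECTION(fname, ...) emits a symbol named _eil_addr_<fname>.
_EIL_SYMBOL = re.compile(r"\b_eil_addr_(\w+)")


def injectable_functions(kallsyms_text):
    """Return the set of function names that have an _eil_addr_ symbol."""
    return {m.group(1) for m in _EIL_SYMBOL.finditer(kallsyms_text)}


def is_injectable(func, kallsyms_text):
    """True if func appears on the error injection whitelist."""
    return func in injectable_functions(kallsyms_text)


def read_kallsyms(path="/proc/kallsyms"):
    """Read the kernel symbol table as text."""
    with open(path) as f:
        return f.read()
```

For example, is_injectable("tcp_v4_connect", read_kallsyms()) would report whether the running kernel has opted that function in.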

yonghong-song, Jul 11 '22 04:07

root@localhost:/bcc# cat /proc/kallsyms | grep _eil_addr
......
ffffffe67516ae60 d _eil_addr___arm64_sys_llseek
ffffffe67516ae70 d _eil_addr_vfs_read
ffffffe67516ae80 d _eil_addr___arm64_sys_read
ffffffe67516ae90 d _eil_addr___arm64_sys_write
......

I added vfs_read to the whitelist, but I still have a problem with bpf_override_return().

root@localhost:/bcc/examples# cat hello_world.py

from bcc import BPF

BPF(text='int kprobe__vfs_read(void *ctx) { bpf_trace_printk("Hello, World!\\n"); bpf_override_return(ctx, 0); return 0; }').trace_print()

./hello_world.py
ioctl(PERF_EVENT_IOC_SET_BPF): Invalid argument
Traceback (most recent call last):
  File "/bcc/examples/./hello_world.py", line 12, in <module>
    BPF(text='int kprobe__vfs_read(void *ctx) { bpf_trace_printk("Hello, World!\\n"); bpf_override_return(ctx, 0); return 0; }').trace_print()
  File "/usr/lib/python3/dist-packages/bcc/__init__.py", line 487, in __init__
    self._trace_autoload()
  File "/usr/lib/python3/dist-packages/bcc/__init__.py", line 1456, in _trace_autoload
    self.attach_kprobe(
  File "/usr/lib/python3/dist-packages/bcc/__init__.py", line 845, in attach_kprobe
    raise Exception("Failed to attach BPF program %s to kprobe %s"
Exception: Failed to attach BPF program b'kprobe__vfs_read' to kprobe b'vfs_read', it's not traceable (either non-existing, inlined, or marked as "notrace")
root@localhost:/bcc/examples#

huangshaobo, Dec 22 '22 13:12

Do you have CONFIG_BPF_KPROBE_OVERRIDE in your kernel config?
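One way to answer this without screenshots is to scan the kernel config text for the two options mentioned in this thread. This is a minimal sketch: it assumes the config is exposed either at /proc/config.gz (which requires CONFIG_IKCONFIG_PROC) or at /boot/config-<release>, and the helper names are illustrative.

```python
import gzip
import os
import platform

# The two options named in this thread as prerequisites.
REQUIRED = ("CONFIG_FUNCTION_ERROR_INJECTION", "CONFIG_BPF_KPROBE_OVERRIDE")


def enabled_options(config_text):
    """Return the set of options that are built in (lines ending in '=y')."""
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        if line.startswith("CONFIG_") and line.endswith("=y"):
            enabled.add(line[: -len("=y")])
    return enabled


def missing_options(config_text, required=REQUIRED):
    """Return the required options that are not enabled, in order."""
    enabled = enabled_options(config_text)
    return [opt for opt in required if opt not in enabled]


def read_kernel_config():
    """Read the running kernel's config from the usual locations."""
    if os.path.exists("/proc/config.gz"):
        with gzip.open("/proc/config.gz", "rt") as f:
            return f.read()
    with open("/boot/config-" + platform.release()) as f:
        return f.read()
```

An empty list from missing_options(read_kernel_config()) would mean both options are enabled.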

yonghong-song, Dec 31 '22 19:12

@yonghong-song Yes, I have. [image]

[image]

This problem has bothered me for a long time. Can you help me?

I am running on the Android platform using adeb. Other examples work fine; only bpf_override_return() produces an error. [image]

Add bpf_override_return API: [image]

Br

huangshaobo, Jan 03 '23 09:01

People in the kernel community are pretty concerned about bpf_override_return(), as it might change kernel behavior and crash the kernel. That is why bpf_override_return() is supported in only a limited number of places.

If you want bpf_override_return() for a particular function, feel free to submit a kernel patch, or mention it here and somebody might help to craft a patch.

yonghong-song, Jan 19 '23 22:01