python interface to libbpf

Open bieganski opened this issue 2 years ago • 10 comments

newcomer question here.

an eBPF tool basically consists of a userspace controller program (compiled for the native CPU architecture) that makes calls to libbpf, and a loadable program compiled for the bpf architecture; let's call it the probe program.

with CO-RE introduced, probe programs are CPU-architecture-agnostic. that's not the case for controller programs though - they need to be compiled for the target CPU arch, which can be cumbersome, for example when cross-compiling.

would there be any problems with the following solution: just as progname.skel.h is auto-generated, a progname.skel.py is generated too, and together with libbpf python bindings one can write a CPU-arch-agnostic controller program?

apologies if i'm missing something obvious.

bieganski avatar Mar 20 '24 13:03 bieganski

It definitely makes sense to have a Python user-space interface, it's only a question of who will design and implement it. If libbpf is used as the low-level library for all the BPF-related things and this Python thing just provides Python-native interfaces, that would be best. There is a lot of intricate and complex logic that libbpf itself takes care of, and I'd strongly discourage anyone from trying to reimplement all of it. libbpf-rs, for instance, takes this approach: all the low-level functionality remains in native libbpf code, while libbpf-rs provides Rust-native interfaces on top of that (keeping the BPF side in C).

anakryiko avatar Mar 20 '24 18:03 anakryiko

@anakryiko thank you for your response.

would such a controller, written purely in Python, be fully CPU-arch agnostic?

i should have been more precise; it's not that i specifically need Python - rather i'm looking for a way of creating a CPU ISA-agnostic controller program, to avoid dealing with cross-compilation issues - Python looked compelling as its interpreter is a basic tool on many Linux distros, even stripped-down ones.

to summarize, would something like that work:

  • write BPF code in C (with CO-RE) and compile it on any platform you wish (you just need properly configured clang for that)
  • write the controller program as a Python script for the previously written BPF probe program
  • be sure that the pair (python script, bpf ELF) can be run on x86-64, aarch64, riscv32 etc, with no code changes, and with the only runtime dependencies being libbpf.so and a proper kernel version? (note that old-fashioned bcc does not work, as it requires clang as a kind-of-runtime dependency)
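
A minimal sketch of what the second and third bullets could look like, assuming libbpf.so (version 1.0 or newer) is installed where the dynamic linker can find it. The helper names `find_libbpf` and `open_and_load` are hypothetical and are not taken from any existing binding; `bpf_object__open_file`, `bpf_object__load`, and `bpf_object__close` are existing libbpf functions, called here through ctypes with hand-written prototypes.

```python
import ctypes
import ctypes.util
import os


def find_libbpf(name="bpf"):
    """Return a ctypes handle to the shared library, or None if it is not installed."""
    path = ctypes.util.find_library(name)
    if path is None:
        return None
    return ctypes.CDLL(path, use_errno=True)


def open_and_load(lib, obj_path):
    """Open a compiled BPF ELF object and load it into the kernel (needs privileges).

    Assumes libbpf >= 1.0, where failures are reported as NULL plus errno
    (open) or as a negative errno value (load).
    """
    # struct bpf_object *bpf_object__open_file(const char *path,
    #                                          const struct bpf_object_open_opts *opts);
    lib.bpf_object__open_file.argtypes = [ctypes.c_char_p, ctypes.c_void_p]
    lib.bpf_object__open_file.restype = ctypes.c_void_p
    # int bpf_object__load(struct bpf_object *obj);
    lib.bpf_object__load.argtypes = [ctypes.c_void_p]
    lib.bpf_object__load.restype = ctypes.c_int
    # void bpf_object__close(struct bpf_object *obj);
    lib.bpf_object__close.argtypes = [ctypes.c_void_p]
    lib.bpf_object__close.restype = None

    obj = lib.bpf_object__open_file(os.fsencode(obj_path), None)
    if not obj:
        raise OSError(ctypes.get_errno(), "bpf_object__open_file failed", obj_path)
    err = lib.bpf_object__load(obj)
    if err:
        lib.bpf_object__close(obj)
        raise OSError(-err, "bpf_object__load failed", obj_path)
    return obj  # opaque handle; release it with lib.bpf_object__close(obj)
```

The script itself contains no architecture-specific code: the same file could call `open_and_load(find_libbpf(), "probe.bpf.o")` (a hypothetical object name) on any machine, and everything arch-specific stays inside libbpf.so and the Python interpreter.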

bieganski avatar Mar 20 '24 21:03 bieganski

tbh, I'd look into using Rust for this (through libbpf-rs). I believe cross-compilation with Rust isn't a big deal, so why not?

anakryiko avatar Mar 21 '24 20:03 anakryiko

@anakryiko I'm also really looking forward to a Python interface for libbpf. Is there an estimate of the work involved? I could look into this.

Superskyyy avatar Apr 11 '24 11:04 Superskyyy

I have no idea what it would take, I'm no Python expert, sorry.

anakryiko avatar Apr 19 '24 18:04 anakryiko

@bieganski

I was also thinking about a similar idea, and I found these:

https://pypi.org/project/bpfmaps/
https://github.com/PeterStolz/pybpfmaps/

I guess that work could address your requirements (at least partially).

thatsdone avatar Jul 11 '24 22:07 thatsdone

Yeah, these bindings help partially and are awesome to have, but in the long run it would be great to have an easy programming interface to the full libbpf capabilities.

Superskyyy avatar Jul 12 '24 14:07 Superskyyy

@Superskyyy

"but in the long run it would be great to have an easy programming interface to the full libbpf capabilities."

I agree with you. Maybe it's a good time to start some kind of small SIG?

thatsdone avatar Jul 13 '24 05:07 thatsdone

FYI: I'm working on initial support for Python bindings for libbpf. As a first step I made a Python wrapper over a simple uprobe example. To give it a try, there is a docker image prepared (NOTE: it assumes libc.so.6 is present under /lib/x86_64-linux-gnu/):

$ sudo docker run -v /lib/x86_64-linux-gnu/:/mnt -it --privileged bieganski/libbpf-py:uprobe-0.1 bash -c "cd /work/bcc/libbpf-tools/ ; ./bindings.py /mnt/libc.so.6 malloc"

Successfully started! Please run `sudo cat /sys/kernel/debug/tracing/trace_pipe` to see output of the BPF programs.

In another terminal:

$ sudo cat /sys/kernel/debug/tracing/trace_pipe

 x-terminal-emul-355076  [004] ...11 632199.951945: bpf_trace_printk: uprobe hit /mnt/libc.so.6:malloc from PID 355076. args: 18,5c0ee1c98010,63,5c0ee802d008,5c0ee7f75f90,741624795301
 x-terminal-emul-355076  [004] ...11 632199.952025: bpf_trace_printk: uprobe hit /mnt/libc.so.6:malloc from PID 355076. args: 18,5c0ee1c98010,37,5c0ee802f008,5c0ee7dd81a0,741624795301
 x-terminal-emul-355076  [004] ...11 632199.952096: bpf_trace_printk: uprobe hit /mnt/libc.so.6:malloc from PID 355076. args: 18,5c0ee1c98010,20,5c0ee8031008,5c0ee7dd81c0,741624795301

To see the loader code, run the following command:

$ sudo docker run bieganski/libbpf-py:uprobe-0.1 cat /work/bcc/libbpf-tools/bindings.py

To build the image yourself, run sudo docker build . -f /path/to/Dockerfile. The Dockerfile is available on GitHub: https://github.com/bieganski/bcc/blob/master/libbpf-tools/Dockerfile. The BPF program being loaded is as follows:

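/* Headers and license are assumed (usual libbpf-tools layout); the excerpt as posted omits them. */
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>

/* Placeholders that the Python loader overwrites with the real library path and symbol name. */
SEC(".data.symbol_name") static char symbol_name[64] = "MOCK_SYMBOL";
SEC(".data.library_path") static char library_path[128] = "MOCK_LIBRARY";

SEC("uprobe//")
int BPF_KPROBE(uprobe_funcname, long long arg1, long long arg2, long long arg3, long long arg4, long long arg5, long long arg6)
{
	/* The upper 32 bits of pid_tgid hold the TGID, i.e. the user-space PID. */
	int pid = bpf_get_current_pid_tgid() >> 32;
	bpf_printk("uprobe hit %s:%s from PID %d. args: %llx,%llx,%llx,%llx,%llx,%llx", library_path, symbol_name, pid, arg1, arg2, arg3, arg4, arg5, arg6);
	return 0;
}

char LICENSE[] SEC("license") = "GPL"; /* assumed; bpf_printk needs a GPL-compatible license */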

Design

  • portability over performance - if you don't mind cross-compilation at all, there are bindings for Go and Rust available. Those are probably the way to go in that case.
  • I decided to use bindings based on the ctypes module (for other options, see https://stackoverflow.com/a/5686873). It suffers from performance problems, but the big win is cross-arch portability - bindings.py above runs on any platform that has a CPython-based python3 binary present. NOTE: for full portability, the BPF program itself should be cross-arch portable as well. Unfortunately this is often not true at the moment (see https://github.com/libbpf/libbpf/issues/852), which prevents us from embedding a pre-compiled BPF program as raw bytes inside a .py file and using it on any CPU architecture.
  • ctypes bindings are generated using a (slightly modified) ctypesgen package. It auto-generates gen/*.py files, which bindings.py uses via import gen.libbpf as libbpf.
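
To illustrate the NOTE about embedding a pre-compiled BPF program as raw bytes: one check such an embedded image would need in any case is that it really is a BPF ELF object and that its byte order matches the host, since clang emits BPF objects for a specific endianness (the bpfel and bpfeb targets). The sketch below reads only the ELF header; the function names are hypothetical, and it is not claimed to cover the problem discussed in the linked issue.

```python
import struct
import sys

EM_BPF = 247  # e_machine value assigned to eBPF in the ELF headers


def describe_bpf_elf(blob):
    """Return (byteorder, e_machine) read from an ELF header held in memory."""
    if len(blob) < 20 or blob[:4] != b"\x7fELF":
        raise ValueError("not an ELF image")
    ei_data = blob[5]  # EI_DATA: 1 = little-endian, 2 = big-endian
    if ei_data not in (1, 2):
        raise ValueError("unknown ELF data encoding")
    byteorder = "little" if ei_data == 1 else "big"
    fmt = "<H" if ei_data == 1 else ">H"
    # e_machine is a 16-bit field at offset 18, right after e_type.
    (e_machine,) = struct.unpack_from(fmt, blob, 18)
    return byteorder, e_machine


def usable_on_this_host(blob):
    """True if the image is a BPF object whose byte order matches the running CPU."""
    byteorder, e_machine = describe_bpf_elf(blob)
    return e_machine == EM_BPF and byteorder == sys.byteorder
```

A .py file could keep the object as a bytes literal, run `usable_on_this_host` on it at startup, and only then hand the buffer to libbpf (for instance through the existing `bpf_object__open_mem` function).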

Current status

  • Proof of concept. Bugs expected.
  • Able to load a simple uprobe program with .data and .rodata sections (tested on an Ubuntu 22.04 host). The program uses bpf_printk for output.
  • Missing an API for runtime communication between the controller (Python) and the BPF program (e.g. ring buffer, BPF maps). The one exception is a simple raw bpf(BPF_MAP_UPDATE_ELEM) syscall invocation that sets the library path and symbol name in the BPF context, so that they can be printed properly.
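
For readers wondering what a raw bpf(BPF_MAP_UPDATE_ELEM) call from Python can look like, here is a minimal sketch using ctypes and libc's syscall(). It is not the code from bindings.py; the names `MapElemAttr` and `map_update_elem` are hypothetical. The struct mirrors the bpf_attr fields the kernel defines for the BPF_MAP_*_ELEM commands, and the syscall-number table lists only x86_64, aarch64, and riscv64, so other architectures would need their own entries.

```python
import ctypes
import platform

BPF_MAP_UPDATE_ELEM = 2  # command number from the kernel's enum bpf_cmd
BPF_ANY = 0              # create the element or update an existing one

# The bpf(2) syscall number differs per architecture, so look it up at run time.
SYS_BPF = {"x86_64": 321, "aarch64": 280, "riscv64": 280}


class MapElemAttr(ctypes.Structure):
    """Mirror of the bpf_attr fields used by the BPF_MAP_*_ELEM commands."""
    _fields_ = [
        ("map_fd", ctypes.c_uint32),
        ("_pad", ctypes.c_uint32),   # key is __aligned_u64, so it starts at offset 8
        ("key", ctypes.c_uint64),    # user-space pointer to the key
        ("value", ctypes.c_uint64),  # user-space pointer to the value
        ("flags", ctypes.c_uint64),
    ]


def map_update_elem(map_fd, key, value, flags=BPF_ANY):
    """Call bpf(BPF_MAP_UPDATE_ELEM, ...) with key and value given as bytes."""
    nr = SYS_BPF[platform.machine()]  # KeyError on architectures not listed above
    libc = ctypes.CDLL(None, use_errno=True)
    key_buf = ctypes.create_string_buffer(key, len(key))
    val_buf = ctypes.create_string_buffer(value, len(value))
    attr = MapElemAttr(map_fd=map_fd,
                       key=ctypes.addressof(key_buf),
                       value=ctypes.addressof(val_buf),
                       flags=flags)
    ret = libc.syscall(ctypes.c_long(nr),
                       ctypes.c_int(BPF_MAP_UPDATE_ELEM),
                       ctypes.byref(attr),
                       ctypes.c_uint(ctypes.sizeof(attr)))
    if ret != 0:
        raise OSError(ctypes.get_errno(), "BPF_MAP_UPDATE_ELEM failed")
```

libbpf backs each global-data section with a single-element array map, so for a section like .data.symbol_name the key would be a 4-byte zero and the value the full contents of the section, padded to its declared size. Going through libbpf's own `bpf_map_update_elem` wrapper instead would avoid the per-architecture syscall table.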

bieganski avatar Oct 01 '24 15:10 bieganski