Assemble a dummy BTF blob for probing StructOps maps

Open ti-mo opened this issue 4 years ago • 9 comments

With https://github.com/cilium/ebpf/pull/321, an API for probing available map types in the kernel was added. However, creating a StructOps map requires a valid BTF blob to be specified, so the probe cannot succeed without one.

@qmonnet was able to get this (partially?) working in bpftool: https://github.com/cilium/ebpf/pull/321#discussion_r662944737.

This issue is for implementing the equivalent using a pre-baked (or assembled at runtime) BTF blob to be able to probe this map type successfully.

cc @rgo3

ti-mo avatar Jul 14 '21 14:07 ti-mo

Works completely for bpftool, but I haven't submitted a patch upstream yet.

Full working patch:
diff --git a/tools/lib/bpf/libbpf_probes.c b/tools/lib/bpf/libbpf_probes.c
index ecaae2927ab8..629d39c98f10 100644
--- a/tools/lib/bpf/libbpf_probes.c
+++ b/tools/lib/bpf/libbpf_probes.c
@@ -203,6 +203,22 @@ bool bpf_probe_map_type(enum bpf_map_type map_type, __u32 ifindex)
 	__u32 btf_key_type_id = 0, btf_value_type_id = 0;
 	struct bpf_create_map_attr attr = {};
 	int fd = -1, btf_fd = -1, fd_inner;
+	int btf_vmlinux_value_type_id = 0;
+	struct btf *btf_vmlinux;
+
+	/* [1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED */
+	__u8 const btf_data[] = {
+		0x9f, 0xeb, 0x01, 0x00, 0x18, 0x00, 0x00, 0x00,
+		0x00, 0x00, 0x00, 0x00, 0x30, 0x00, 0x00, 0x00,
+		0x30, 0x00, 0x00, 0x00, 0x09, 0x00, 0x00, 0x00,
+		0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x01,
+		0x04, 0x00, 0x00, 0x00, 0x20, 0x00, 0x00, 0x01,
+		0x00, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x0d,
+		0x00, 0x00, 0x00, 0x00, 0x07, 0x00, 0x00, 0x00,
+		0x01, 0x00, 0x00, 0x00, 0x05, 0x00, 0x00, 0x00,
+		0x00, 0x00, 0x00, 0x0c, 0x02, 0x00, 0x00, 0x00,
+		0x00, 0x69, 0x6e, 0x74, 0x00, 0x78, 0x00, 0x61,
+		0x00 };
 
 	key_size	= sizeof(__u32);
 	value_size	= sizeof(__u32);
@@ -245,6 +261,17 @@ bool bpf_probe_map_type(enum bpf_map_type map_type, __u32 ifindex)
 		value_size = 0;
 		max_entries = 4096;
 		break;
+	case BPF_MAP_TYPE_STRUCT_OPS:
+		btf_fd = bpf_load_btf(btf_data, sizeof(btf_data), NULL, 0, false);
+		if (btf_fd < 0)
+			return false;
+		value_size = 256;
+		btf_vmlinux = libbpf_find_kernel_btf();
+		if (libbpf_get_error(btf_vmlinux))
+			return false;
+		btf_vmlinux_value_type_id = btf__find_by_name_kind(btf_vmlinux,
+		     "bpf_struct_ops_tcp_congestion_ops", BTF_KIND_STRUCT);
+		break;
 	case BPF_MAP_TYPE_UNSPEC:
 	case BPF_MAP_TYPE_HASH:
 	case BPF_MAP_TYPE_ARRAY:
@@ -264,7 +291,6 @@ bool bpf_probe_map_type(enum bpf_map_type map_type, __u32 ifindex)
 	case BPF_MAP_TYPE_XSKMAP:
 	case BPF_MAP_TYPE_SOCKHASH:
 	case BPF_MAP_TYPE_REUSEPORT_SOCKARRAY:
-	case BPF_MAP_TYPE_STRUCT_OPS:
 	default:
 		break;
 	}
@@ -292,6 +318,7 @@ bool bpf_probe_map_type(enum bpf_map_type map_type, __u32 ifindex)
 		attr.max_entries = max_entries;
 		attr.map_flags = map_flags;
 		attr.map_ifindex = ifindex;
+		attr.btf_vmlinux_value_type_id = btf_vmlinux_value_type_id;
 		if (btf_fd >= 0) {
 			attr.btf_fd = btf_fd;
 			attr.btf_key_type_id = btf_key_type_id;

(I got the BTF blob by examining, with strace, the load of a BTF object created from int foo = 0; or something like that. Maybe I can find a nicer way to present it in the code before submitting, but that's a detail.)
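For readers who want to check what the blob contains, here is a minimal sketch (not part of the patch) that decodes its header and the first type record. The struct `btf_hdr` and the helper names are made up for this illustration; the field layout follows `struct btf_header` from `<linux/btf.h>`, redeclared so that no kernel headers are needed. Decoding the rest by hand suggests the type section holds three records (an INT, a FUNC_PROTO and a FUNC), but the sketch only looks at the first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* The blob from the patch, copied byte for byte. */
static const uint8_t btf_data[] = {
	0x9f, 0xeb, 0x01, 0x00, 0x18, 0x00, 0x00, 0x00,
	0x00, 0x00, 0x00, 0x00, 0x30, 0x00, 0x00, 0x00,
	0x30, 0x00, 0x00, 0x00, 0x09, 0x00, 0x00, 0x00,
	0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x01,
	0x04, 0x00, 0x00, 0x00, 0x20, 0x00, 0x00, 0x01,
	0x00, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x0d,
	0x00, 0x00, 0x00, 0x00, 0x07, 0x00, 0x00, 0x00,
	0x01, 0x00, 0x00, 0x00, 0x05, 0x00, 0x00, 0x00,
	0x00, 0x00, 0x00, 0x0c, 0x02, 0x00, 0x00, 0x00,
	0x00, 0x69, 0x6e, 0x74, 0x00, 0x78, 0x00, 0x61,
	0x00 };

/* Hypothetical mirror of struct btf_header (24 bytes on the wire). */
struct btf_hdr {
	uint16_t magic;
	uint8_t  version;
	uint8_t  flags;
	uint32_t hdr_len;
	uint32_t type_off;	/* relative to the end of the header */
	uint32_t type_len;
	uint32_t str_off;	/* relative to the end of the header */
	uint32_t str_len;
};

/* Read a little-endian u32 regardless of host byte order. */
static uint32_t le32(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Decode and bounds-check the header. Returns 0 on success, -1 otherwise. */
static int btf_parse_header(const uint8_t *data, size_t len, struct btf_hdr *h)
{
	assert(data != NULL && h != NULL);
	if (len < 24)
		return -1;
	h->magic    = (uint16_t)(data[0] | (data[1] << 8));
	h->version  = data[2];
	h->flags    = data[3];
	h->hdr_len  = le32(data + 4);
	h->type_off = le32(data + 8);
	h->type_len = le32(data + 12);
	h->str_off  = le32(data + 16);
	h->str_len  = le32(data + 20);
	if (h->magic != 0xeb9f || h->hdr_len < 24 || h->hdr_len > len)
		return -1;
	if ((uint64_t)h->hdr_len + h->type_off + h->type_len > len)
		return -1;
	if ((uint64_t)h->hdr_len + h->str_off + h->str_len > len)
		return -1;
	return 0;
}

/* Kind of the first type record: bits 24..28 of its info word. */
static unsigned int btf_first_kind(const uint8_t *data, const struct btf_hdr *h)
{
	const uint8_t *t = data + h->hdr_len + h->type_off;

	return (le32(t + 4) >> 24) & 0x1f;
}

/* Name of the first type record, looked up in the string section. */
static const char *btf_first_name(const uint8_t *data, const struct btf_hdr *h)
{
	const uint8_t *t = data + h->hdr_len + h->type_off;

	return (const char *)data + h->hdr_len + h->str_off + le32(t);
}
```

Calling `btf_parse_header(btf_data, sizeof(btf_data), &h)` and then `btf_first_kind()` / `btf_first_name()` should confirm that type [1] is the INT named `int` described in the patch comment.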

qmonnet avatar Jul 14 '21 14:07 qmonnet

@qmonnet Thank you for the example! Looks like this does indeed rely on being able to obtain the vmlinux BTF, which requires sysfs to be mounted and accessible at /sys. Be aware (also for bpftool) that this is not guaranteed when running containerized apps, though Docker seems to mount sysfs by default. Things might be different on other container runtimes and schedulers.

Just echoing here the discussion(s) we had in this PR:

  • https://github.com/cilium/ebpf/pull/321#discussion_r662944737
  • https://github.com/cilium/ebpf/pull/321#discussion_r662954128

For now, it seems unlikely we'll be able to create a (mock) StructOps map, so it's best to conclude that StructOps maps are not supported from the perspective of the current process if the process can't obtain a copy of the vmlinux BTF blob.
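A minimal sketch of that fallback rule, assuming the vmlinux BTF is exposed at `/sys/kernel/btf/vmlinux` (the usual sysfs location; other fallback paths are deliberately ignored here). The function names are hypothetical and are not taken from libbpf or cilium/ebpf.

```c
#include <stdbool.h>
#include <unistd.h>

/* Assumed sysfs location of the vmlinux BTF blob. */
#define VMLINUX_BTF_PATH "/sys/kernel/btf/vmlinux"

/* True if the calling process can read the file at path. */
static bool btf_file_readable(const char *path)
{
	return access(path, R_OK) == 0;
}

/*
 * Without the vmlinux BTF we cannot look up the type ID of
 * bpf_struct_ops_tcp_congestion_ops, so treat StructOps maps as
 * unsupported from this process's point of view.
 */
static bool struct_ops_probe_possible(void)
{
	return btf_file_readable(VMLINUX_BTF_PATH);
}
```

In a container where sysfs is not mounted, `struct_ops_probe_possible()` returns false and the caller can report the map type as unsupported without attempting map creation.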

ti-mo avatar Jul 15 '21 12:07 ti-mo

(continuation of https://github.com/cilium/ebpf/pull/321#discussion_r670461168)

So yes, the vmlinux BTF is loaded into the kernel somehow?

Yes, (all?) kernel BTF seems to be preloaded as far as I can see:

~ strace bpftool btf dump id 1
...
bpf(BPF_BTF_GET_FD_BY_ID, {btf_id=1}, 120)
...
[1] INT 'long unsigned int' size=8 bits_offset=0 nr_bits=64 encoding=(none)
... etc.

The next IDs seem to be BTFs for various other subsystems.

This makes me doubt whether sysfs is really needed to obtain the vmlinux BTF blob, although BPF_BTF_GET_FD_BY_ID may be a more recent addition.

It still requires a way to find the id for kernel BTF, though :thinking:.

vmlinux seems to be fixed at 1, but we should make sure there's a const that can be depended on.

Then once we have this fd we could assign it to attr->btf_fd, this won't be valid and the map won't be created,

I don't see why that would be invalid. :sweat_smile: If we can obtain the vmlinux BTF reliably using a syscall and parse its graph for bpf_struct_ops_tcp_congestion_ops, we have our probe.

ti-mo avatar Jul 16 '21 12:07 ti-mo

I don't see why that would be invalid. :sweat_smile:

Because the kernel explicitly checks that this BTF object is not kernel BTF in the case of struct_ops maps, see map_create() in kernel/bpf/syscall.c:

	if ( [...] || attr->btf_vmlinux_value_type_id) {
		struct btf *btf;

		btf = btf_get_by_fd(attr->btf_fd);
		[...]
		if (btf_is_kernel(btf)) {
			btf_put(btf);
			err = -EACCES;
			goto free_map;
		}

Agreed on the other points.

qmonnet avatar Jul 20 '21 08:07 qmonnet

A random thing to keep in mind:

bpf(BPF_BTF_GET_FD_BY_ID, {btf_id=1}, 120)

That syscall requires CAP_SYS_ADMIN, so it won't work for feature probes I'd say.
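One way a probe could tell in advance whether that syscall has a chance of working is to inspect the effective capability set of the process. The sketch below reads `CapEff` from `/proc/self/status` and tests bit 21, which is the value of CAP_SYS_ADMIN in `<linux/capability.h>`. The helper names are invented for this example, and it assumes procfs is mounted.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* CAP_SYS_ADMIN is capability number 21; redefined to avoid extra headers. */
#define PROBE_CAP_SYS_ADMIN 21

/* True if capability number cap is set in the bitmask. */
static bool cap_is_set(uint64_t mask, unsigned int cap)
{
	return cap < 64 && ((mask >> cap) & 1);
}

/* Read the effective capability mask of this process. Returns 0 or -1. */
static int read_effective_caps(uint64_t *mask)
{
	FILE *f = fopen("/proc/self/status", "r");
	char line[256];
	int ret = -1;

	if (!f)
		return -1;
	while (fgets(line, sizeof(line), f)) {
		unsigned long long v;

		/* The line looks like "CapEff:\t000001ffffffffff". */
		if (sscanf(line, "CapEff: %llx", &v) == 1) {
			*mask = v;
			ret = 0;
			break;
		}
	}
	fclose(f);
	return ret;
}
```

A caller would combine the two: if `read_effective_caps(&m)` succeeds and `cap_is_set(m, PROBE_CAP_SYS_ADMIN)` is false, the BPF_BTF_GET_FD_BY_ID route can be skipped.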

lmb avatar Jul 20 '21 08:07 lmb

But then most of the probes will require some level of privilege anyway?

qmonnet avatar Jul 20 '21 09:07 qmonnet

This will become possible to implement after https://github.com/cilium/ebpf/pull/641 has been merged.

ti-mo avatar Apr 28 '22 13:04 ti-mo

Closing this as we no longer really need this for probing (see https://github.com/cilium/ebpf/pull/746) and https://github.com/cilium/ebpf/pull/641 will be pushed over the line at some point.

ti-mo avatar Jul 22 '22 14:07 ti-mo

Reopened as we'll still need to gain the ability to craft a valid StructOps program at some point for probing helper type availability.

ti-mo avatar Jul 29 '22 12:07 ti-mo

There's currently no strong driver for probing helpers in tracing, struct_ops, ext and lsm programs. Implementing these probes is not trivial, as the programs need to be loaded with specific attach targets, etc.

To revisit later if there is a need.

ti-mo avatar Mar 03 '23 11:03 ti-mo