advanced03-AF_XDP kernel filter issues

Open kewinrausch opened this issue 4 years ago • 8 comments

Hi everyone,

I'm investigating AF_XDP and ran into a problem when compiling and using the af_xdp_kern.c source file as is. The filter compiles without errors, but loading it (on x86_64, at least) is rejected by the in-kernel BPF verifier:

libbpf: load bpf program failed: Invalid argument
libbpf: -- BEGIN DUMP LOG ---
libbpf: 
0: (61) r1 = *(u32 *)(r1 +16)
1: (63) *(u32 *)(r10 -4) = r1
2: (bf) r2 = r10
3: (07) r2 += -4
4: (18) r1 = 0xffff952a7b59c000
6: (85) call bpf_map_lookup_elem#1
7: (15) if r0 == 0x0 goto pc+7
 R0=map_value(id=0,off=0,ks=4,vs=4,imm=0) R10=fp0,call_-1 fp-8=mmmm????
8: (61) r1 = *(u32 *)(r0 +0)
 R0=map_value(id=0,off=0,ks=4,vs=4,imm=0) R10=fp0,call_-1 fp-8=mmmm????
9: (bf) r2 = r1
10: (07) r2 += 1
11: (63) *(u32 *)(r0 +0) = r2
 R0=map_value(id=0,off=0,ks=4,vs=4,imm=0) R1_w=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R2_w=inv(id=0,umin_value=1,umax_value=4294967296,var_off=(0x0; 0x1ffffffff)) R10=fp0,call_-1 fp-8=mmmm????
12: (b7) r0 = 2
13: (57) r1 &= 1
14: (55) if r1 != 0x0 goto pc+13
 R0=inv2 R1=inv0 R2=inv(id=0,umin_value=1,umax_value=4294967296,var_off=(0x0; 0x1ffffffff)) R10=fp0,call_-1 fp-8=mmmm????
15: (bf) r2 = r10
16: (07) r2 += -4
17: (18) r1 = 0xffff952a7c78ce00
19: (85) call bpf_map_lookup_elem#1
cannot pass map_type 17 into func bpf_map_lookup_elem#1
processed 18 insns (limit 1000000) max_states_per_insn 0 total_states 3 peak_states 3 mark_read 3

libbpf: -- END LOG --
libbpf: failed to load program 'xdp_sock'
libbpf: failed to load object './af_xdp_kern.o'
ERR: loading BPF-OBJ file(./af_xdp_kern.o) (-22): Invalid argument
ERR: loading file: ./af_xdp_kern.o

Digging into the kernel verifier and libbpf led me to discover that the filter in the example is similar to the "default" one shipped in libbpf, with one small difference: the example's filter lacks the qidconf_map that xsk_lookup_bpf_maps looks for, which is a necessary component for the library and the XSK subsystem.

In fact, in kernel 5.2.11, file tools/lib/bpf/xsk.c:448, function xsk_lookup_bpf_maps:

if (xsk->qidconf_map_fd < 0 || xsk->xsks_map_fd < 0) {
        err = -ENOENT;
        xsk_delete_bpf_maps(xsk);
}

forces the map to exist in every custom filter that uses XSK, or the error occurs.
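The effect of that check can be sketched in plain user-space C. This is only a model, not libbpf code: `struct xsk_maps_model` and `check_required_maps` are hypothetical names, and it assumes that a descriptor stays negative when no map with the expected name is found in the loaded program.

```c
#include <errno.h>

/* Hypothetical stand-in for the two map descriptors tracked per socket. */
struct xsk_maps_model {
    int qidconf_map_fd; /* assumed negative when "qidconf_map" is missing */
    int xsks_map_fd;    /* assumed negative when "xsks_map" is missing */
};

/* Mirrors the quoted condition: both maps must be present, else -ENOENT. */
int check_required_maps(const struct xsk_maps_model *m)
{
    if (m->qidconf_map_fd < 0 || m->xsks_map_fd < 0)
        return -ENOENT;
    return 0;
}
```

Under this model, a custom filter that defines only xsks_map leaves qidconf_map_fd negative, so the check fails even though the socket map itself is present.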

This error can be fixed by adding the following map definition to af_xdp_kern.c:

struct bpf_map_def SEC("maps") qidconf_map = {
	.type = BPF_MAP_TYPE_ARRAY,
	.key_size = sizeof(int),
	.value_size = sizeof(int),
	.max_entries = 64,
};
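For readers wondering how a filter would use this map, here is a plain user-space C model of the decision flow. It is a sketch, not BPF code: it assumes the filter follows the libbpf "default" program of that era, where qidconf_map is indexed by RX queue and a non-zero entry means an AF_XDP socket is bound to that queue. The names `legacy_filter_model` and `enum verdict` are made up for illustration.

```c
#include <stddef.h>

/* Verdict names echo the XDP return codes; the values are local to this model. */
enum verdict { VERDICT_ABORTED, VERDICT_PASS, VERDICT_REDIRECT };

/*
 * qidconf: array indexed by RX queue, non-zero when a socket is bound.
 * n:       number of entries (64 in the map definition above).
 */
enum verdict legacy_filter_model(const int *qidconf, size_t n,
                                 unsigned int rx_queue_index)
{
    if (rx_queue_index >= n)
        return VERDICT_ABORTED;  /* an array map lookup would return NULL here */
    if (qidconf[rx_queue_index])
        return VERDICT_REDIRECT; /* stands in for redirecting via xsks_map */
    return VERDICT_PASS;         /* no socket bound: leave packet to the stack */
}
```

The point of the model is that the filter never looks up xsks_map itself; it only consults the array map and then redirects, which is why it passes the verifier on kernels that reject lookups in map type 17.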

Cheers, Kewin R.

kewinrausch avatar Aug 30 '19 08:08 kewinrausch

I wrote this example against the latest bpf-next branch, and I think there this map is not needed.

@netoptimizer do you know how we would like to handle this? Which kernel do we build against?

chaudron avatar Aug 30 '19 09:08 chaudron

If you want my opinion, this decision probably depends on who is the target for such tutorials.

If you are presenting the tech to kernel hackers, going for the beta/testing branches is probably fine (but you need to point that out in the documentation); otherwise it is better to go for a stable release (and again name the release in the documentation).

In any case, I would say it is better to state somewhere which version/branch the tutorial has been built against. In my case it was 5.2.11, since 5.2.x seems to be the version where AF_XDP technology also appears.

kewinrausch avatar Sep 02 '19 07:09 kewinrausch

The error feedback:

19: (85) call bpf_map_lookup_elem#1
cannot pass map_type 17 into func bpf_map_lookup_elem#1

Looks like you are doing a lookup in an AF_XDP map (xskmap), which is only supported in later kernels... I think it was @tohojo that implemented that recently.

Around this kernel commit: 8daed7677a1da

So, I suggest we implement two versions of the BPF code and, after detecting whether this eBPF facility is supported, choose which one to load.
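One rough way to make that choice is sketched below. Note that it swaps in a kernel version check for a real feature probe, which is cruder than what is suggested above. The 5.3 threshold is an assumption (the thread is not sure whether xskmap lookup arrived in 5.2 or 5.3, and 5.2.11 rejects it), and `af_xdp_kern_legacy.o` is a hypothetical file name for a qidconf_map-based variant.

```c
#include <stdio.h>
#include <sys/utsname.h>

/* Parse "5.2.11-foo" into major/minor; returns 0 on success, -1 otherwise. */
int parse_kernel_release(const char *release, int *major, int *minor)
{
    if (sscanf(release, "%d.%d", major, minor) != 2)
        return -1;
    return 0;
}

/* Assumption: lookups in an xskmap are accepted from 5.3 onward. */
int kernel_has_xskmap_lookup(int major, int minor)
{
    return major > 5 || (major == 5 && minor >= 3);
}

/* Pick the object file to load; the legacy name is hypothetical. */
const char *pick_bpf_object(int major, int minor)
{
    if (kernel_has_xskmap_lookup(major, minor))
        return "af_xdp_kern.o";        /* looks up xsks_map directly */
    return "af_xdp_kern_legacy.o";     /* qidconf_map-based variant */
}

/* Use the running kernel's release; fall back to the legacy object on error. */
const char *pick_for_running_kernel(void)
{
    struct utsname u;
    int major, minor;

    if (uname(&u) != 0 || parse_kernel_release(u.release, &major, &minor) != 0)
        return pick_bpf_object(0, 0);
    return pick_bpf_object(major, minor);
}
```

A version check can misjudge kernels with backported features, so a probe that actually tries to load a tiny program doing the lookup would be more reliable; the version check is just the shortest thing to show.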

netoptimizer avatar Sep 02 '19 09:09 netoptimizer

Jesper Dangaard Brouer writes:

The error feedback:

19: (85) call bpf_map_lookup_elem#1
cannot pass map_type 17 into func bpf_map_lookup_elem#1

Looks like you are doing a lookup in an AF_XDP map (xskmap), which is only supported in later kernels... I think it was @tohojo that implemented that recently.

Yeah, lookup in xskmap is quite recent (5.2 or 5.3?). Not me that implemented that, though :)

As for the kernel version targeted by the tutorial: in all our tutorial presentations (e.g., at netdevconf) we have been saying version 4.19+. It doesn't appear that this is actually written down anywhere, though it probably should be; and if we move to a later version as the required minimum, that should be a deliberate decision...

tohojo avatar Sep 02 '19 09:09 tohojo

@kewinrausch

advanced03-AF_XDP is ok in my environment.

My environment is Fedora29 with bpf-next kernel v5.3-rc1 ( https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git )

My xdp-tutorial git from https://github.com/chaudron/xdp-tutorial

You also can refer to: https://github.com/xdp-project/xdp-tutorial/issues/60#issue-482401608

hi-glenn avatar Sep 03 '19 03:09 hi-glenn

@netoptimizer The filter shipped by the tutorial tries to look up the socket map instead of going for the queue_id one. I quickly fixed the program by looking at tools/lib/bpf/xsk.c:282 in kernel 5.2.11, downloadable with the big yellow button on kernel.org (I don't know how to be more precise than that).

That is the location of libbpf's "standard" xsk filter.

I'm not trying to do anything more than get the tutorial filter running in the kernel right now. I did that with the modifications listed in the first post.

@glennWang

Just to be clear: the tutorial works great and without any problem until I try to load the custom BPF filter using the --filename and --progsec options. When that is the case, with the latest stable kernel downloaded one week ago from kernel.org (5.2.11), the BPF verifier simply blocks the filter from being loaded.

More detail on the filter: by custom I mean loading af_xdp_kern.o untouched (without any modification) using those af_xdp_user option flags.

If you run the tutorial without customizing the filter, libbpf seems to fall back to using its own and everything works fine. The problem arises only if you try to load the kernel part of the tutorial (compiling gives no failure) by forcing the --filename option.

kewinrausch avatar Sep 03 '19 13:09 kewinrausch

Let me figure out a solution so we can support both ways of doing this. Will try to look at this sometime next week.

chaudron avatar Sep 03 '19 13:09 chaudron

[xskmap] Well, what about at least writing in some visible places that the code needs particular kernel versions? (even some 5.2.x doesn't accept it for me, i.e. from latest stable branch) This problem got me stalled for quite some time, as the errors don't really hint that a different kernel should help. For tutorials I'd personally probably choose the older approach that should work on a wider range of kernels, but if it's visibly documented...

EDIT: thanks a lot for the tutorials, BTW :-)

vcunat avatar Sep 11 '19 07:09 vcunat