redbpf icon indicating copy to clipboard operation
redbpf copied to clipboard

BPF verifier rejects code using bpf_fib_lookup

Open kbknapp opened this issue 4 years ago • 3 comments

When writing a tc BPF program, one section performs a bpf_fib_lookup (bpf-helpers(7)). However, when initializing the struct used as the params argument (*mut bpf_fib_lookup) the verifier rejects the program with a invalid read from stack off -34+0 size 1.

I believe this is due to Rust treating the anonymous unions as fully initialized, even if one of the smaller variants is used, leaving padding bytes uninitialized. Since the verifier requires all memory accessed be initialized, the code is rejected. Granted, I'm open to being 100% wrong.

Here is the subset of the code that is causing the failure:

            let mut params = bpf_fib_lookup_struct {
                family: AF_INET as u8,
                l4_protocol: IPPROTO_TCP as u8,
                tot_len: skb.load(ip_start + offset_of!(iphdr, tot_len))?,
                __bindgen_anon_1: bpf_fib_lookup__bindgen_ty_1 {
                    tos: (skb.load::<u8>(ip_start + offset_of!(iphdr, tos))?),
                },
                __bindgen_anon_2: bpf_fib_lookup__bindgen_ty_2 {
                    ipv4_src: (skb.load::<u32>(ip_start + offset_of!(iphdr, saddr))?).to_be(),
                },
                __bindgen_anon_3: bpf_fib_lookup__bindgen_ty_3 {
                    ipv4_dst: (skb.load::<u32>(ip_start + offset_of!(iphdr, daddr))?).to_be(),
                },
                ..mem::zeroed::<bpf_fib_lookup_struct>()
            };

            let ret = bpf_fib_lookup(
                skb.skb as *mut _,
                &mut params as *mut _,
                size_of::<bpf_fib_lookup_struct>() as i32,
                0,
            );

The relevant section the initialization of __bindgen_anon_{1,2,3} where the variant used is smaller than the full union.

I've done other experiments by manually initializing all fields to their default values (instead of using ..mem::zeroed::<bpf_fib_lookup_struct>()) but same result.

Here is the verifiers rejection:

Prog section 'tc_action/hairpin' rejected: Permission denied (13)!
 - Type:         3
 - Instructions: 325 (0 over limit)
 - License:      GPL

Verifier analysis:

Skipped 4630 bytes, use 'verb' option for the full verbose log.
[...]
inv0 R8=inv(id=0,umax_value=76,var_off=(0x0; 0x7c)) R9=inv(id=0) R10=fp0 fp-104=????mmmm fp-120=????mmmm fp-128=mmmmmmmm fp-136=mmmmmmmm fp-144=mmmmmmmm
118: (6b) *(u16 *)(r10 -108) = r1
119: (b7) r7 = 0
120: (6b) *(u16 *)(r10 -106) = r7
last_idx 120 first_idx 118
regs=80 stack=0 before 119: (b7) r7 = 0
121: (18) r1 = 0x9df5e389
123: (63) *(u32 *)(r10 -112) = r1
124: (bf) r2 = r10
125: (07) r2 += -112
126: (bf) r3 = r10
127: (07) r3 += -128
128: (18) r1 = 0xffff88a6f5694800
130: (b7) r4 = 0
131: (85) call bpf_map_update_elem#2
132: (61) r3 = *(u32 *)(r10 -124)
133: (bf) r1 = r6
134: (b7) r2 = 24
135: (18) r4 = 0x9df5e389
137: (b7) r5 = 4
138: (85) call bpf_l3_csum_replace#10
139: (61) r3 = *(u32 *)(r10 -128)
140: (bf) r1 = r6
141: (b7) r2 = 24
142: (18) r4 = 0x9f41e8c2
144: (b7) r5 = 4
145: (85) call bpf_l3_csum_replace#10
146: (18) r1 = 0x89e3f59d
148: (63) *(u32 *)(r10 -104) = r1
149: (bf) r3 = r10
150: (07) r3 += -104
151: (bf) r1 = r6
152: (b7) r2 = 30
153: (b7) r4 = 4
154: (b7) r5 = 0
155: (85) call bpf_skb_store_bytes#9
last_idx 155 first_idx 139
regs=10 stack=0 before 154: (b7) r5 = 0
regs=10 stack=0 before 153: (b7) r4 = 4
156: (18) r1 = 0xc2e8419f
158: (63) *(u32 *)(r10 -104) = r1
159: (bf) r3 = r10
160: (07) r3 += -104
161: (bf) r1 = r6
162: (b7) r2 = 26
163: (b7) r4 = 4
164: (b7) r5 = 0
165: (85) call bpf_skb_store_bytes#9
last_idx 165 first_idx 156
regs=10 stack=0 before 164: (b7) r5 = 0
regs=10 stack=0 before 163: (b7) r4 = 4
166: (bf) r3 = r10
167: (07) r3 += -16
168: (bf) r1 = r6
169: (b7) r2 = 16
170: (b7) r4 = 2
171: (85) call bpf_skb_load_bytes#26
last_idx 171 first_idx 156
regs=10 stack=0 before 170: (b7) r4 = 2
172: (67) r0 <<= 32
173: (c7) r0 s>>= 32
174: (6d) if r7 s> r0 goto pc-59
 R0_w=inv(id=0,umax_value=2147483647,var_off=(0x0; 0x7fffffff)) R6=ctx(id=0,off=0,imm=0) R7=invP0 R8=inv(id=0,umax_value=76,var_off=(0x0; 0x7c)) R9=inv(id=0) R10=fp0 fp-16=??????mm fp-104=????mmmm fp-112=mmmmmmmm fp-120=????mmmm fp-128=mmmmmmmm fp-136=mmmmmmmm fp-144=mmmmmmmm
175: (69) r8 = *(u16 *)(r10 -16)
176: (bf) r3 = r10
177: (07) r3 += -16
178: (bf) r1 = r6
179: (b7) r2 = 15
180: (b7) r4 = 1
181: (85) call bpf_skb_load_bytes#26
last_idx 181 first_idx 172
regs=10 stack=0 before 180: (b7) r4 = 1
182: (67) r0 <<= 32
183: (c7) r0 s>>= 32
184: (6d) if r7 s> r0 goto pc-70
 R0_w=inv(id=0,umax_value=2147483647,var_off=(0x0; 0x7fffffff)) R6=ctx(id=0,off=0,imm=0) R7=invP0 R8=inv(id=0,umax_value=65535,var_off=(0x0; 0xffff)) R9=inv(id=0) R10=fp0 fp-16=??????mm fp-104=????mmmm fp-112=mmmmmmmm fp-120=????mmmm fp-128=mmmmmmmm fp-136=mmmmmmmm fp-144=mmmmmmmm
185: (71) r9 = *(u8 *)(r10 -16)
186: (bf) r3 = r10
187: (07) r3 += -16
188: (bf) r1 = r6
189: (b7) r2 = 26
190: (b7) r4 = 4
191: (85) call bpf_skb_load_bytes#26
last_idx 191 first_idx 182
regs=10 stack=0 before 190: (b7) r4 = 4
192: (67) r0 <<= 32
193: (c7) r0 s>>= 32
194: (b7) r7 = 0
195: (6d) if r7 s> r0 goto pc-80
 R0_w=inv(id=0,umax_value=2147483647,var_off=(0x0; 0x7fffffff)) R6=ctx(id=0,off=0,imm=0) R7_w=inv0 R8=inv(id=0,umax_value=65535,var_off=(0x0; 0xffff)) R9=inv(id=0,umax_value=255,var_off=(0x0; 0xff)) R10=fp0 fp-16=????mmmm fp-104=????mmmm fp-112=mmmmmmmm fp-120=????mmmm fp-128=mmmmmmmm fp-136=mmmmmmmm fp-144=mmmmmmmm
196: (61) r1 = *(u32 *)(r10 -16)
197: (7b) *(u64 *)(r10 -136) = r1
198: (bf) r3 = r10
199: (07) r3 += -4
200: (bf) r1 = r6
201: (b7) r2 = 26
202: (b7) r4 = 4
203: (85) call bpf_skb_load_bytes#26
last_idx 203 first_idx 192
regs=10 stack=0 before 202: (b7) r4 = 4
204: (67) r0 <<= 32
205: (c7) r0 s>>= 32
206: (6d) if r7 s> r0 goto pc-92
 R0_w=inv(id=0,umax_value=2147483647,var_off=(0x0; 0x7fffffff)) R6=ctx(id=0,off=0,imm=0) R7=inv0 R8=inv(id=0,umax_value=65535,var_off=(0x0; 0xffff)) R9=inv(id=0,umax_value=255,var_off=(0x0; 0xff)) R10=fp0 fp-8=mmmm???? fp-16=????mmmm fp-104=????mmmm fp-112=mmmmmmmm fp-120=????mmmm fp-128=mmmmmmmm fp-136=mmmmmmmm fp-144=mmmmmmmm
207: (dc) r8 = be16 r8
208: (71) r1 = *(u8 *)(r10 -34)
invalid read from stack off -34+0 size 1
processed 255 insns (limit 1000000) max_states_per_insn 0 total_states 19 peak_states 19 mark_read 10

Error fetching program/map!
Unable to load program

kbknapp avatar Nov 12 '20 17:11 kbknapp

As another experiment I re-wrote bpf_fib_lookup (the struct) manually using structs instead of unions (with padding bytes manually inserted), and the verifier accepts that. So I'm pretty certain its the unions, and either Rust or bindgen's handling of them that is causing this issue.

Here is the simplified version just used to test the idea, which loads correctly:

#[repr(C)]
pub struct bpf_fib_lookup__tos {
    tos: u8,
    __pad: [u8; 3],
}
#[repr(C)]
pub struct bpf_fib_lookup__ip {
    ip: u32, // be
    __pad: [u32; 3],
}

#[repr(C)]
pub struct my_bpf_fib_lookup_struct {
    family: u8,
    l4_protocol: u8,
    sport: u16, // be
    dport: u16, // be
    tot_len: u16,
    ifindex: u32,
    tos: bpf_fib_lookup__tos,
    src: bpf_fib_lookup__ip,
    dst: bpf_fib_lookup__ip,
    h_vlan_proto: u16, // be
    h_vlan_TCI: u16,   // be
    smac: [u8; 6],
    dmac: [u8; 6],
}

// ... 

            let mut params = my_bpf_fib_lookup_struct {
                family: AF_INET as u8,
                l4_protocol: IPPROTO_TCP as u8,
                tot_len: skb.load(ip_start + offset_of!(iphdr, tot_len))?,
                tos: bpf_fib_lookup__tos {
                    tos: (skb.load::<u8>(ip_start + offset_of!(iphdr, tos))?),
                    __pad: [0u8; 3],
                },
                src: bpf_fib_lookup__ip {
                    ip: (skb.load::<u32>(ip_start + offset_of!(iphdr, saddr))?).to_be(),
                    __pad: [0u32; 3],
                },
                dst: bpf_fib_lookup__ip {
                    ip: (skb.load::<u32>(ip_start + offset_of!(iphdr, daddr))?).to_be(),
                    __pad: [0u32; 3],
                },
                ..mem::zeroed::<my_bpf_fib_lookup_struct>()
            };

            let ret = bpf_fib_lookup(
                skb.skb as *mut _,
                &mut params as *mut _ as *mut _,
                size_of::<my_bpf_fib_lookup_struct>() as i32,
                0,
            );

kbknapp avatar Nov 12 '20 17:11 kbknapp

I'm not an assembly expert, but it looks to me like instantiating a union with one of the smaller variants does not initialize all memory of that union.

union FooUnion {
    a: u8,
    b: u64
}

impl FooUnion{
    fn new() -> Self {
        FooUnion {
            a: 0        
        }
    }
}

Results in the following asm (Rust 1.47):

example::FooUnion::new:
        push    rax
        mov     byte ptr [rsp], 0
        mov     rax, qword ptr [rsp]
        pop     rcx
        ret

The Rust reference say the compiler treats all memory as initialized as reading from the variants requires unsafe, but the BPF verifier requires the memory actually be initialized.

This is my understanding at least.

kbknapp avatar Nov 13 '20 17:11 kbknapp

Sorry for taking a while to respond, I had to kind of sit with this to understand what's going on. I'm not surprised at this, and I think you're on the right track. It was always my impression that bindgen interfaces are not super usable as a data structure to construct directly, only as a read-only view into the kernel.

I see two approaches to resolving this: since you already identified that fully initializing the structure is a viable way of passing the verifier, I'd be open to accept a PR that implements this specific API in a dedicated redbpf-probes namespace, with some idiomatic wrapper/constructor, and gradually build up a Rust API around bpf-helpers.

Another approach that I see is automatically generating high-level constructors like we do with accessors that wrap bpf_probe_read already, based on the output of bindgen. This generic approach is going to be more robust, but also potentially a bit of a rabbit hole, complete with type rewriting and logic to detect unions that actually need to be initialized.

rsdy avatar Nov 16 '20 15:11 rsdy