ebpf-verifier Verifying memory access in loops

I'm having an issue verifying the behavior of a program that accesses memory in a loop. The example I have is not really minimal, but I'm hoping someone might be able to describe the intended behavior of the verifier anyway.

I have written a custom ebpf_platform_t to be have a very simple context object:

struct simple_ebpf_ctx {
    uint64_t data;
    uint64_t data_end;
    uint64_t meta_data;
}

And then want to verify a program that is given the bytes of a network packet in the data of the context:

#include "simple_ebpf_defs.h"

struct packet {
    uint8_t packet_buffer[1500];
};

SEC("simple_ebpf")
uint64_t
simple_ebpf_main(void *state)
{
    struct simple_ebpf_ctx *ctx = state;
    struct packet *p = (struct packet *)ctx->data;
    uint8_t *data = p->packet_buffer;

    int i = 0;
    while (*(data + 2000 + i) == 'a') {
        i++;
    }

    return 0;
}

I would expect (with --termination) that this should not be verified, since it contains a potentially loop that doesn't necessarily terminate and is not checking to make sure the data accesses are between data and data_end. But it passes.

Is this something that prevail should be able to do? Or am I misunderstanding? Thanks!

Jun 02 '25 20:06 cjdoucette

Thank you for the report! This definitely looks like a bug to me. Can you provide the object file, for reproducibility?

Jun 02 '25 20:06 elazarg

For sure! Although, I did just realize that the program fails verification (as expected) if compiled with no optimizations, and passes if there are optimizations (clang -O1). Maybe this is expected?

Object files here: https://github.com/cjdoucette/sample-object-file

Jun 02 '25 20:06 cjdoucette

llvm-objdump -D O1_pass.o:

Disassembly of section cf_ebpf_generic:

0000000000000000 <cf_ebpf_main>:
       0:	b7 00 00 00 00 00 00 00	r0 = 0
       1:	95 00 00 00 00 00 00 00	exit

So with -O1 the loop is gone and the program becomes trivial. That's why prevail doesn't complain.

I'm confused about your example. It looks like it's accessing out of bounds because the size of packet_buffer is 1500 bytes and you already access in the first iteration at offset 2000. Does prevail complain about memory safety?

Note that when the LLVM compiler detects undefined behavior (e.g., out-of-bounds, non-terminating loops, null de-references, etc) will usually remove aggressively code. I think this is what is happening here. But even if there is no undefined behavior but the loop does not have side effects (e.g., write to memory) then LLVM optimizations still might remove the loop.

Jun 04 '25 01:06 caballa

I'm confused about your example. It looks like it's accessing out of bounds because the size of packet_buffer is 1500 bytes and you already access in the first iteration at offset 2000.

It's a toy example to try to demonstrate that prevail will catch these types of unsafe accesses.

Does prevail complain about memory safety?

For the unoptimized object file, yes, it does:

16: Lower bound must be at least meta_offset (valid_access(r1.offset+2000, width=1) for read)
16: Upper bound must be at most packet_size (valid_access(r1.offset+2000, width=1) for read)

0,0.149138,7472

Note that when the LLVM compiler detects undefined behavior (e.g., out-of-bounds, non-terminating loops, null de-references, etc) will usually remove aggressively code. I think this is what is happening here. But even if there is no undefined behavior but the loop does not have side effects (e.g., write to memory) then LLVM optimizations still might remove the loop.

OK, I see -- maybe the fact that it is a toy example that produces undefined behavior that's the problem here. That makes sense.

One last question: do you know if there's any precedent or discussion of doing static verification on unoptimized object files, but actually executing optimized versions of the object file? Seems like that would be the best of both worlds.

Thanks for the responses!

Jun 04 '25 20:06 cjdoucette

Great, so then prevail behaves as expected.

Regarding your question, verify with -O0 and deploy with -O1 or -O2 makes sense assuming we are happy trusting the compiler optimizations. But I'm not a eBPF developer so I don't know what they usually do in practice.

Jun 04 '25 20:06 caballa

Got it, thanks again so much!

Jun 04 '25 20:06 cjdoucette

@cjdoucette can we close this?

Jun 12 '25 15:06 elazarg