ebpf-verifier icon indicating copy to clipboard operation
ebpf-verifier copied to clipboard

Alternative implementation for the CFG computation

Open Maxime-Derri opened this issue 2 months ago • 0 comments

The instructions of each function are stored in a single instruction vector, which is passed as a parameter to from_sequence. Initially, a basic block is created for each instruction. Then, Assume instructions are added into the goto and fallthrough paths. Each call to a subprogram is then unrolled: the basic blocks of the called function are duplicated and inserted into the CFG by linking the corresponding labels and updating the stack frame prefixes. Subprograms are unrolled for every call — not only in the main function. This means that calls within original subprograms are also unrolled, even if they are not “checked” by the forward analyzer, as their pre-invariants are set to bottom. This leads to unnecessary operations, and the subprogram unrolling is both time- and memory-consuming.

Ideas for a possible alternative implementation:

  • The Program struct generated by from_sequence could store, for each function a list of instructions, a CFG, and a WTO. The function locations (start and end indices) can be determined in read_elf when subprograms are inserted into RawProg structs. As there are 128-bit instructions, these locations must be updated in unmarshal, where the vector of labeled instructions is created. Then, in from_sequence, a CFG can be computed separately for each function, as well as a WTO. Each CFG, WTO, and instruction list could be stored in maps (label, struct) in the Program structure, where label is the entry point of each function and struct is a list of instructions, a CFG or a WTO.
  • As subprograms are no longer unrolled, the stack frame prefix must be extracted from labels. It can be computed and managed dynamically when walking through the CFGs. Moreover, we need to set the special field of labels for CallLocal instructions, so they can be handled before transform_to_post to set up the environment for the next subprogram to check. The special field must also be set for other instruction types that use the stack frame prefix, such as Call, CallLocal, and Exit. The stack frame prefix could be copied into invariants when needed.
  • Some checks, such as verifying the number of stack frames, can be moved into the forward analyzer.
  • The forward analyzer needs two std::stack to manage stack frames: one for keys to access CFGs and WTOs, and another for stack frame prefixes.

See #948 for a prototype.

Maxime-Derri avatar Nov 13 '25 09:11 Maxime-Derri