fuzz-introspector CFG improvements

Fuzz-introspector relies on extracting control-flow graphs to determine reachability of the code under analysis. It also extracts more data than a pure CFG contains, and we use that data for fine-grained analysis. However, relying on LTO and a somewhat homegrown approach to CFG extraction may not be ideal. Alternatives to consider:

Non LTO-based
Use runtime data to improve CFG extraction. For example, if we run a fuzzer and observe coverage in a function that is not included in the reachability graph, then that function should be added to the graph.
Use other implementations of reachability/callgraph extraction: https://groups.google.com/g/llvm-dev/c/SWIiEBWaJVg/m/Jmf_8jVoAQAJ
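
The runtime idea in the first item could look roughly like the sketch below. It assumes we already have two lists of function names, one from the static reachability graph and one from a coverage report; how those lists are produced is left out, and `find_covered_but_unreached` is a hypothetical helper name, not something fuzz-introspector provides.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if `name` appears in `names`, 0 otherwise (linear scan). */
static int contains(const char *const *names, size_t n, const char *name) {
    for (size_t i = 0; i < n; i++) {
        if (strcmp(names[i], name) == 0) {
            return 1;
        }
    }
    return 0;
}

/*
 * Hypothetical helper: writes into `missing` every function that was
 * covered at runtime but is absent from the static reachable set.
 * Returns the number of entries written, capped at `cap`.
 */
size_t find_covered_but_unreached(const char *const *reachable, size_t n_reachable,
                                  const char *const *covered, size_t n_covered,
                                  const char **missing, size_t cap) {
    assert(reachable != NULL && covered != NULL && missing != NULL);
    size_t n_missing = 0;
    for (size_t i = 0; i < n_covered && n_missing < cap; i++) {
        if (!contains(reachable, n_reachable, covered[i])) {
            missing[n_missing++] = covered[i];
        }
    }
    return n_missing;
}
```

Each name returned this way is a function the static graph missed, so it is a candidate to be added to the reachability graph.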

The benefit of using our own implementation is that it enables fast development (until technical debt grows too large), and this is of fairly high priority atm.

Dec 06 '21 21:12 DavidKorczynski

An example case is systemd, where some parts of the project force a build with bfd, i.e. they are incompatible with gold and LTO, which means fuzz-introspector won't work: https://github.com/google/oss-fuzz/pull/7573#issuecomment-1100126108

Apr 15 '22 14:04 DavidKorczynski

One way to enhance the statically extracted call graph, specifically for indirect calls, is to use this feature of sancov:

With an additional ...=trace-pc,indirect-calls flag __sanitizer_cov_trace_pc_indirect(void *callee) will be inserted on every indirect call.

What we should do is build the fuzz target with those flags and implement __sanitizer_cov_trace_pc_indirect() to capture the actual callee of each indirect call at run time. Then, by running the instrumented fuzz target on the available corpus, we can collect the callees of the indirect calls.
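
A minimal sketch of such a callback is below. The callback signature is the one quoted above; everything else (`record_indirect_edge`, `indirect_edge_count`, `dump_indirect_edges`, the fixed-size table) is hypothetical and only for illustration. It assumes a single-threaded target, a Clang or GCC toolchain for `__builtin_return_address`, and that this file itself is built without coverage instrumentation. As far as I know the trace-pc callbacks are not implemented in the sanitizer runtime, so the per-edge `__sanitizer_cov_trace_pc()` is stubbed out here as well. I believe libFuzzer already defines these symbols itself, so the sketch assumes a standalone driver that replays the corpus.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define MAX_EDGES 4096 /* arbitrary cap for this sketch */

struct indirect_edge {
    uintptr_t caller_pc; /* return address inside the calling function */
    uintptr_t callee;    /* address of the function actually called */
    unsigned long hits;
};

static struct indirect_edge edges[MAX_EDGES];
static size_t num_edges;

/* Hypothetical helper: record a (caller, callee) pair, deduplicating
 * repeats. Not thread-safe. New pairs are dropped once the table is full. */
void record_indirect_edge(uintptr_t caller_pc, uintptr_t callee) {
    for (size_t i = 0; i < num_edges; i++) {
        if (edges[i].caller_pc == caller_pc && edges[i].callee == callee) {
            edges[i].hits++;
            return;
        }
    }
    if (num_edges < MAX_EDGES) {
        edges[num_edges].caller_pc = caller_pc;
        edges[num_edges].callee = callee;
        edges[num_edges].hits = 1;
        num_edges++;
    }
}

/* Per-edge callback inserted by trace-pc; unused in this sketch. */
void __sanitizer_cov_trace_pc(void) {}

/* Inserted before every indirect call when indirect-calls is enabled. */
void __sanitizer_cov_trace_pc_indirect(void *callee) {
    uintptr_t caller_pc = (uintptr_t)__builtin_return_address(0);
    record_indirect_edge(caller_pc, (uintptr_t)callee);
}

size_t indirect_edge_count(void) { return num_edges; }

/* Write the collected pairs as raw addresses, one per line.
 * Mapping addresses back to function names is left out. */
void dump_indirect_edges(FILE *out) {
    assert(out != NULL);
    for (size_t i = 0; i < num_edges; i++) {
        fprintf(out, "0x%" PRIxPTR " -> 0x%" PRIxPTR " (%lu hits)\n",
                edges[i].caller_pc, edges[i].callee, edges[i].hits);
    }
}
```

The driver would call `dump_indirect_edges()` after replaying the corpus. The raw addresses would then need to be symbolized before the caller/callee pairs can be merged into the static call graph.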

Oct 04 '22 19:10 Navidem