fuzz-introspector Discrepancy between reachability and coverage for Python projects

Open DavidKorczynski opened this issue 2 years ago • 2 comments

https://oss-fuzz-introspector.storage.googleapis.com/index.html recently got a facelift, and this made some issues obvious. One is that code coverage is often a lot higher than reachability for Python projects. We should investigate why this is the case and come up with a solution.

Feb 01 '23 17:02 DavidKorczynski

An example: glom has reachability of 13.6% but code coverage of 73.0%

Feb 01 '23 17:02 DavidKorczynski

One of the main reasons for this discrepancy is that the reachability analysis is much more focused than the code coverage analysis. For example, the following fuzzer:

# A lot of code in mod1, mod2 and mod3 gets runtime coverage purely due to these imports
import sys  # needed for sys.argv below

import mod1
import mod2
import mod3

import atheris

# Zero reachability: the entrypoint calls nothing
def TestOneInput(data):
    return

# Code to trigger atheris
def main():
    atheris.Setup(sys.argv, TestOneInput)
    atheris.Fuzz()

if __name__ == "__main__":
    main()

This fuzzer will have a lot of code coverage in the modules mod1, mod2 and mod3. Coverage collection starts before the modules are imported, and the import statements cause a lot of code in those modules to execute (such as registering each def in the module). None of this is considered by the reachability analysis, which only considers code reachable from within TestOneInput, and in this case that is 0. In this sense, a lot of the "code coverage" is false code coverage from a fuzzing perspective: much of the code counted in the code coverage report is not necessarily relevant from a "fuzzing code coverage" perspective.
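To make the import-time effect concrete, here is a minimal sketch using only the standard library. It does not use the coverage tooling from OSS-Fuzz; it uses sys.settrace as a stand-in line recorder, and MODULE_SOURCE is a hypothetical module body standing in for something like mod1. The point is only that running a module's top level executes its def and class statements, while the function bodies stay untouched.

```python
import sys

# Hypothetical stand-in for the contents of a module such as mod1.
MODULE_SOURCE = """\
CONSTANT = 1

def helper(x):
    return x + CONSTANT

class Parser:
    def parse(self, data):
        return helper(len(data))
"""


def lines_executed_on_import(source, filename="<mod1>"):
    """Return the line numbers executed when the module body runs once."""
    executed = set()

    def tracer(frame, event, arg):
        # Ignore frames that do not belong to the module under test.
        if frame.f_code.co_filename != filename:
            return None
        if event == "line":
            executed.add(frame.f_lineno)
        return tracer

    code = compile(source, filename, "exec")
    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        # Running the module body is what an import statement does.
        exec(code, {"__name__": "mod1"})
    finally:
        sys.settrace(previous)
    return executed


import_lines = lines_executed_on_import(MODULE_SOURCE)
print(sorted(import_lines))
```

The recorded lines include the assignment and the def/class statements, but not the return statements inside helper and Parser.parse (lines 4 and 8 of the hypothetical module). A line-based coverage report would still count the former as covered, even though an empty TestOneInput reaches none of it.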

A couple of things that can be done:

Enable code coverage to be started within the fuzzer entrypoint. The downside is that we will not achieve full code coverage unless imports are also placed there, etc.
Make some type of middle ground where it is clear which code coverage is achieved by the fuzzer itself versus, e.g., code that runs before the fuzzer executes.
Include reachability analysis based on the full fuzzer file, versus only the fuzzer entrypoint.
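As a rough illustration of the third option, the sketch below compares the calls found inside TestOneInput with the calls found anywhere in the fuzzer file, using Python's ast module. This is a toy comparison and not how fuzz-introspector's reachability analysis is implemented; FUZZER_SOURCE, mod1.configure and mod1.parse are hypothetical names made up for the example, and ast.unparse requires Python 3.9 or newer.

```python
import ast

# Hypothetical fuzzer file; mod1.configure and mod1.parse are made-up names.
FUZZER_SOURCE = """\
import sys
import atheris
import mod1

mod1.configure()

def TestOneInput(data):
    mod1.parse(data)

def main():
    atheris.Setup(sys.argv, TestOneInput)
    atheris.Fuzz()

if __name__ == "__main__":
    main()
"""


def called_names(node):
    """Collect the dotted name of every call expression under an AST node."""
    names = set()
    for child in ast.walk(node):
        if isinstance(child, ast.Call):
            names.add(ast.unparse(child.func))
    return names


tree = ast.parse(FUZZER_SOURCE)

# Locate the fuzzer entrypoint among the top-level definitions.
entrypoint = next(
    n for n in tree.body
    if isinstance(n, ast.FunctionDef) and n.name == "TestOneInput"
)

entrypoint_calls = called_names(entrypoint)   # entrypoint-only view
whole_file_calls = called_names(tree)         # full-file view
only_outside_entrypoint = whole_file_calls - entrypoint_calls

print(sorted(entrypoint_calls))
print(sorted(only_outside_entrypoint))
```

In this toy file the entrypoint-only view sees a single call, while the full-file view also picks up the module-level mod1.configure() call and the atheris setup calls. A real analysis would additionally need to follow those calls into the imported modules.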

Feb 27 '23 19:02 DavidKorczynski
