oss-fuzz coveragepy: initial integration

Aug 05 '22 16:08 DavidKorczynski

I am very interested in this, but I don't yet understand what I am looking at! :) Can you write up some details that can go into the coverage.py development/contributing docs? What sorts of errors is this looking for? What do the failures look like?

Aug 16 '22 15:08 nedbat

> I am very interested in this, but I don't yet understand what I am looking at! :) Can you write up some details that can go into the coverage.py development/contributing docs? What sorts of errors is this looking for? What do the failures look like?

Thanks @nedbat, and thanks for the email too! I continued the conversation in the coveragepy PR.

Aug 16 '22 19:08 DavidKorczynski

I don't know Atheris at all. From reading this, it looks like we ask Atheris for completely random data, and then try to parse it as Python? So 99.99999% of the time it will be a failed parse, which is fine, and we're looking for other kinds of failures?
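To make the shape of that question concrete, here is a minimal sketch of such a harness. It is an illustration only, not the harness from the PR: `fuzz_one_input` is a hypothetical name, and the standard library's `ast.parse` stands in for whatever coverage.py code the real fuzzer exercises. The Atheris wiring is left out so the example runs with the standard library alone.

```python
import ast


def fuzz_one_input(data: bytes) -> str:
    """Hypothetical fuzz entry point: treat the fuzzer's bytes as Python source.

    Assumption: ast.parse is a stand-in for the real code under test.
    """
    # Arbitrary bytes are rarely valid UTF-8, so decode leniently.
    source = data.decode("utf-8", errors="replace")
    try:
        ast.parse(source)
    except (SyntaxError, ValueError):
        # Expected outcome: almost every input is simply not valid Python.
        return "rejected"
    # Any other exception type propagates to the caller; that is the kind
    # of failure a fuzzer would report as a finding.
    return "parsed"


if __name__ == "__main__":
    for sample in (b"x = 1", b"x = = 1", b"\xff\x00("):
        print(sample, fuzz_one_input(sample))
```

The design point is the split between expected exceptions, which are swallowed, and everything else, which is allowed to escape and be flagged.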

I'm wondering about other ways to use fuzzing. I use Hypothesis for specific property-based testing in a small corner of the test suite, but haven't had other ways to use it, since I was stuck on the question of how to generate random Python programs.

Aug 16 '22 22:08 nedbat

> From reading this, it looks like we ask Atheris for completely random data, and then try to parse it as Python?

I think it's more accurate to consider Atheris a genetic, mutational algorithm rather than a source of completely random data. In particular, it uses runtime instrumentation to measure the code coverage of a given input, and then accumulates a set of inputs that collectively explore the code of coveragepy. This makes fuzzing significantly more effective than feeding in completely random data.
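The idea can be sketched with a toy coverage-guided loop. This is not how Atheris is implemented; it is a pure-Python illustration that uses `sys.settrace` for line coverage, and `target`, `lines_hit`, `mutate`, and `fuzz` are all hypothetical names. An input is kept in the corpus only if it executes a line no earlier input reached, and later mutations start from the corpus.

```python
import random
import sys


def target(data: bytes) -> None:
    """Toy stand-in for the code under test (hypothetical, not coverage.py)."""
    if len(data) > 0 and data[0] == ord("F"):
        if len(data) > 1 and data[1] == ord("U"):
            if len(data) > 2 and data[2] == ord("Z"):
                raise RuntimeError("deep branch reached")


def lines_hit(data: bytes):
    """Run target under a line tracer; return (set of lines executed, crashed)."""
    hit = set()

    def tracer(frame, event, arg):
        # Only follow frames that belong to the toy target.
        if frame.f_code is target.__code__:
            if event == "line":
                hit.add(frame.f_lineno)
            return tracer
        return None

    previous = sys.gettrace()
    sys.settrace(tracer)
    crashed = False
    try:
        target(data)
    except RuntimeError:
        crashed = True
    finally:
        sys.settrace(previous)
    return hit, crashed


def mutate(data: bytes, rng: random.Random) -> bytes:
    """Apply one random byte-level mutation: insert, replace, or delete."""
    buf = bytearray(data)
    choice = rng.randrange(3)
    if choice == 0 or not buf:
        buf.insert(rng.randrange(len(buf) + 1), rng.randrange(256))
    elif choice == 1:
        buf[rng.randrange(len(buf))] = rng.randrange(256)
    else:
        del buf[rng.randrange(len(buf))]
    return bytes(buf)


def fuzz(iterations: int, seed: int = 0):
    """Return (corpus, first crashing input or None)."""
    rng = random.Random(seed)
    corpus = [b""]
    seen, _ = lines_hit(b"")
    for _ in range(iterations):
        candidate = mutate(rng.choice(corpus), rng)
        hit, crashed = lines_hit(candidate)
        if crashed:
            return corpus, candidate
        if not hit <= seen:
            # New coverage: remember the input so later mutations build on it.
            seen |= hit
            corpus.append(candidate)
    return corpus, None


if __name__ == "__main__":
    corpus, crash = fuzz(200_000, seed=1)
    print("corpus:", corpus)
    print("crash:", crash)
```

Because each input that reaches a deeper branch is kept and mutated further, the loop climbs the nested conditions one byte at a time. Purely random inputs would have to get all three leading bytes right at once, roughly a one in 16.7 million chance per attempt.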

OSS-Fuzz supports code coverage visualisation, which means that once integrated we will be able to assess how much of the coveragepy code has been explored by the fuzzers. We can use this to guide us towards extending the fuzzing suite in order to reach optimal code coverage. This is the first big milestone I usually pursue when working on a project.

We can pursue other avenues as well, such as more property-oriented testing by way of fuzzing. For that, I'd need to know a bit more about the coveragepy codebase to suggest some interesting directions.

Aug 17 '22 09:08 DavidKorczynski

@nedbat the first issue has now been reported. Since this is the first issue I wrote a few extra comments to help you interpret the information. See here: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=50381#c1

I added some follow-up comments; note in particular: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=50381#c3 :)!

Sorry for the spam @nedbat -- I managed to create an accurate reproducer in the terminal: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=50381#c4

Aug 18 '22 13:08 DavidKorczynski

Thanks, but that link asks me to log in using my mostly behind-the-scenes Gmail account name, then tells me permission is denied.

Aug 19 '22 02:08 nedbat

Meanwhile, I can read https://oss-fuzz.com/testcase-detail/5485653491580928

Aug 19 '22 02:08 nedbat

I guess you have to authorize [email protected]

Aug 19 '22 10:08 nedbat

That's odd: I haven't experienced being able to see the detailed reports but not the Monorail issues -- usually it's the other way around.

I used the email provided here as the account needed for login: https://github.com/nedbat/coveragepy/issues/1436#issuecomment-1216792798

Note that this email should be the primary email on the affiliated Google account.

Aug 19 '22 10:08 DavidKorczynski

> I guess you have to authorize [email protected]

Will do!

Aug 19 '22 10:08 DavidKorczynski