Added first fuzzers as step in OSS-Fuzz integration.

Open DavidKorczynski opened this issue 5 years ago • 7 comments

What do these changes do?

Hi Maintainers,

I would like to set up continuous fuzzing of aiohttp through OSS-Fuzz. OSS-Fuzz is a free service run by Google that continuously fuzzes important open source projects. It recently added support for Python, and it would be great to have aiohttp integrated. The only expectation when integrating with OSS-Fuzz is that reported bugs get fixed. This is not a "hard" requirement, in that no one enforces it; the point is that if bugs are not fixed, running the fuzzers is a waste of resources, which we would like to avoid.

In this PR https://github.com/google/oss-fuzz/pull/4764 I have done exactly that, namely created the logic needed on the OSS-Fuzz side to integrate aiohttp. If you would like to integrate, could you please provide the email address(es) that should get access to the data produced by OSS-Fuzz, such as bug reports, coverage reports and other statistics? The addresses should be linked to a Google account in order to view the detailed reports. Note that the addresses affiliated with the project will be public in the OSS-Fuzz repo, as they will be part of a configuration file.

Are there changes in behavior for the user?

No

Related issue number

Fixes #3159.

Checklist

  • [x] I think the code is well written
  • [ ] Unit tests for the changes exist
  • [ ] Documentation reflects the changes
  • [ ] If you provide code modification, please add yourself to CONTRIBUTORS.txt
    • The format is <Name> <Surname>.
    • Please keep alphabetical order, the file is sorted by names.
  • [ ] Add a new news fragment into the CHANGES folder
    • name it <issue_id>.<type> for example (588.bugfix)
    • if you don't have an issue_id change it to the pr id after creating the pr
    • ensure type is one of the following:
      • .feature: Signifying a new feature.
      • .bugfix: Signifying a bug fix.
      • .doc: Signifying a documentation improvement.
      • .removal: Signifying a deprecation or removal of public API.
      • .misc: A ticket has been closed, but it is not of interest to users.
    • Make sure to use full sentences with correct case and punctuation, for example: "Fix issue with non-ascii contents in doctest text files."

DavidKorczynski avatar Dec 07 '20 15:12 DavidKorczynski

Thanks for the response @webknjaz

I would be happy to take care of those aspects, as long as aiohttp would like to integrate into OSS-Fuzz. Do you know if this is the case?

DavidKorczynski avatar Dec 07 '20 18:12 DavidKorczynski

@DavidKorczynski I don't see any reason why not. But let's wait for @asvetlov to weigh in.

webknjaz avatar Dec 07 '20 18:12 webknjaz

Thanks @webknjaz

Note that if you prefer not to have the fuzzers in the aiohttp repository, we can also store them in the OSS-Fuzz repository for now. However, in the long term it would be great to get fuzzing into the main aiohttp repo.

DavidKorczynski avatar Dec 07 '20 19:12 DavidKorczynski

I support the idea in general but should learn more about OSS-Fuzz. I've started by reading the https://github.com/google/oss-fuzz README. From my understanding, fuzzers can help find programming errors related to Python C extensions. Is that correct? Maybe you can point to resources that are good starters? How is it integrated, how can I get feedback from fuzzing runs, what should we do to write really good and useful fuzzers, what is the fuzzer maintenance procedure, etc.?

Fuzzers can live in aiohttp repo, I see no problem with it. My email is [email protected]

asvetlov avatar Dec 07 '20 20:12 asvetlov

Codecov Report

Merging #5320 (42f0d25) into master (91dadb3) will increase coverage by 0.00%. The diff coverage is n/a.

@@           Coverage Diff           @@
##           master    #5320   +/-   ##
=======================================
  Coverage   97.16%   97.17%           
=======================================
  Files          41       41           
  Lines        8739     8768   +29     
  Branches     1402     1404    +2     
=======================================
+ Hits         8491     8520   +29     
- Misses        129      130    +1     
+ Partials      119      118    -1     
Flag Coverage Δ
unit 97.05% <ø> (+<0.01%) :arrow_up:

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
aiohttp/helpers.py 96.68% <0.00%> (-0.21%) :arrow_down:
aiohttp/hdrs.py 100.00% <0.00%> (ø)
aiohttp/streams.py 97.45% <0.00%> (ø)
aiohttp/connector.py 96.51% <0.00%> (ø)
aiohttp/multipart.py 96.27% <0.00%> (ø)
aiohttp/test_utils.py 99.68% <0.00%> (ø)
aiohttp/web_runner.py 97.74% <0.00%> (ø)
aiohttp/web_protocol.py 86.41% <0.00%> (ø)
aiohttp/pytest_plugin.py 97.45% <0.00%> (ø)
aiohttp/web_fileresponse.py 100.00% <0.00%> (ø)
... and 7 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update 91dadb3...42f0d25.

codecov[bot] avatar Jan 26 '21 20:01 codecov[bot]

@asvetlov

Fuzzers have traditionally been used on native code as a way of catching memory corruption bugs and the like. Recently, however, fuzzing has spread to more languages, e.g. Go, Rust and Python. When fuzzing pure Python apps (no native code), the main kind of bug you will be looking for is an unhandled exception. For example, if function F promises to throw only exceptions from the set X, a fuzzer can be used to try to break things and determine whether F can ever throw an exception outside the promised set X. These bugs are closer to reliability bugs, but still crucial.
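To make the "promised set X" idea concrete, here is a minimal sketch using only the standard library. The target `parse_header_line` and the helpers `check_one_input` and `naive_fuzz` are hypothetical names invented for illustration; they are not part of aiohttp or OSS-Fuzz. The driver also uses blind random inputs rather than coverage guidance, so it only illustrates the check itself, not how a real fuzzer explores code.

```python
import random
from typing import Callable, Optional, Tuple

# The promised set X: exceptions the target documents as possible.
PROMISED = (ValueError,)


def parse_header_line(line: bytes) -> Tuple[bytes, bytes]:
    """Hypothetical target F: split b"Name: value"; documented to raise only ValueError."""
    name, sep, value = line.partition(b":")
    if not sep or not name.strip():
        raise ValueError("malformed header line")
    return name.strip(), value.strip()


def check_one_input(target: Callable[[bytes], object], data: bytes) -> Optional[Exception]:
    """Return the unexpected exception, or None if the target stayed within PROMISED."""
    try:
        target(data)
    except PROMISED:
        return None  # promised failure mode, not a bug
    except Exception as exc:
        return exc  # outside the promised set: this is the finding
    return None


def naive_fuzz(target: Callable[[bytes], object], runs: int = 1000, seed: int = 0):
    """Feed random byte strings to the target; return (input, exception) on the first finding."""
    rng = random.Random(seed)
    for _ in range(runs):
        data = bytes(rng.randrange(256) for _ in range(rng.randrange(32)))
        exc = check_one_input(target, data)
        if exc is not None:
            return data, exc
    return None
```

A target that indexes into its input without checking the length, for instance, would be reported by `check_one_input` as raising `IndexError` on empty input, since `IndexError` is not in the promised set.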

The specific fuzzer we will be using in this context is Atheris: https://github.com/google/atheris This is a good place to start for understanding fuzzing in a Python context.

I also created a small video that shows how the Atheris fuzzer works here: https://www.youtube.com/watch?v=Wjjlk_W7WFo

The integration consists of a set of files in the OSS-Fuzz repository that build and run the fuzzers continuously. Once this gets merged, the fuzzers can be run through OSS-Fuzz with the following commands:

git clone https://github.com/google/oss-fuzz
cd oss-fuzz
python3 infra/helper.py build_image aiohttp
python3 infra/helper.py build_fuzzers aiohttp
python3 infra/helper.py run_fuzzer aiohttp fuzz_http_parser

The fuzzers themselves do not need a lot of maintenance; the main work is fixing the bugs they find. From a high-level perspective, the way to write really good fuzzers is to target high-level functions, so that the coverage-guided aspects of the fuzzer let it reach deep into the code.
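As a rough sketch of what such a harness can look like with Atheris, the example below targets a single high-level entry point. It is not the fuzzer from this PR: `urllib.parse.urlsplit` from the standard library stands in for an aiohttp function, and the choice of `ValueError` as the expected exception is an assumption about that stand-in. Only `atheris.Setup` and `atheris.Fuzz` are used from Atheris, and the import is deferred so the harness function can be exercised without Atheris installed.

```python
import sys
import urllib.parse


def TestOneInput(data: bytes) -> None:
    """Harness entry point: called by the fuzzer once per generated input."""
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError:
        return  # not valid text; nothing to test for this stand-in target

    try:
        # One high-level call; the fuzzer's coverage feedback is what
        # steers inputs toward the deeper parsing branches behind it.
        urllib.parse.urlsplit(text)
    except ValueError:
        pass  # expected failure mode for malformed URLs (assumption for this stand-in)
    # Any other exception propagates and is reported as a crash.


def main() -> None:
    # Imported here so that TestOneInput can be used without Atheris.
    import atheris

    # Depending on the Atheris version, the target may also need to be
    # instrumented for coverage feedback; that step is omitted here.
    atheris.Setup(sys.argv, TestOneInput)
    atheris.Fuzz()
```

Calling `main()` from the script's entry point starts the fuzzing loop; `TestOneInput` can also be called directly with a saved crashing input to reproduce a finding.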

You will get a lot of feedback about the status of the fuzzers from oss-fuzz.com, e.g. statistics, open crashes, the inputs that trigger crashes, stack traces, fixed crashes, etc. You will be able to log in there and see all of it. In addition, you will get an email whenever a crash is found.

We can also integrate the fuzzing into the CI, so the fuzzers will be run for each pull request. This can be hugely beneficial as a way of catching bugs before they are shipped.

DavidKorczynski avatar Jan 26 '21 20:01 DavidKorczynski

Status update: following https://github.com/aio-libs/aiohttp/issues/6772#issuecomment-1210415458, I will be looking to revive this PR in the coming days.

DavidKorczynski avatar Aug 10 '22 09:08 DavidKorczynski