Generate segment and function coverage data over time for an experiment

Open BharathMonash opened this issue 4 years ago • 14 comments

As suggested, I am opening a new PR for easier review. I have kept the code for generating function and segment coverage in a separate file, and likewise for the unit tests.

Linking the previous PR to retain all our conversations :) Previous PR: #832

This PR generates 3 compressed CSV files that record segment and function coverage over time. The files and their headers are listed below. (NOTE: the CSVs are available in compressed (.gz) format.)

1. segments.csv (only covered segments) - (benchmark_id, fuzzer_id, trial_id, time_stamp, file_id, line, col)
2. functions.csv - (benchmark_id, fuzzer_id, trial_id, time_stamp, function_id, hits)
3. names.csv  - (id, name, type)

Filestore path for these files: experiment_data/$(EXPERIMENT_NAME)/coverage/data/{files above}.csv.gz
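As a rough illustration of how one of these files could be produced and consumed, here is a minimal sketch using only the Python standard library. The column names come from the list above; the helper names (`write_rows`, `read_rows`), the presence of a header row in the file, and the sample values are assumptions made for the example, not the PR's actual code.

```python
import csv
import gzip
import os
import tempfile

# Column names as listed in the PR description for functions.csv.
FUNCTIONS_HEADER = ['benchmark_id', 'fuzzer_id', 'trial_id', 'time_stamp',
                    'function_id', 'hits']


def write_rows(path, header, rows):
    """Write a header line and data rows to a gzip-compressed CSV file."""
    with gzip.open(path, 'wt', newline='') as handle:
        writer = csv.writer(handle)
        writer.writerow(header)
        writer.writerows(rows)


def read_rows(path):
    """Read a gzip-compressed CSV file into a list of dicts keyed by column."""
    with gzip.open(path, 'rt', newline='') as handle:
        return list(csv.DictReader(handle))


# Made-up rows for illustration only; real IDs would resolve via names.csv.
demo_path = os.path.join(tempfile.mkdtemp(), 'functions.csv.gz')
write_rows(demo_path, FUNCTIONS_HEADER, [(0, 1, 7, 900, 42, 3),
                                         (0, 1, 7, 1800, 42, 5)])
demo_rows = read_rows(demo_path)
```

Note that `csv.DictReader` returns every field as a string, so numeric columns such as `hits` need an explicit `int(...)` conversion before analysis.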

Segment and function coverage is recorded while measuring snapshots (every 900 seconds), hence the time_stamp column. The generated data captures the concrete code elements covered over time. For functions, we also maintain hit counts. For segments, we record only those that have been covered, and at later time stamps we add only the newly covered segments. Together with compression, this keeps the data space efficient.
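The "only newly covered segments" idea can be sketched as a simple delta encoding. This is an illustrative sketch, not the PR's implementation: the function names are hypothetical, segments are modeled as `(file_id, line, col)` tuples, and it assumes snapshots arrive in increasing time order with cumulative coverage.

```python
def new_segments_per_snapshot(snapshots):
    """Yield (time_stamp, segment) only the first time a segment is covered.

    snapshots: iterable of (time_stamp, set of (file_id, line, col)) pairs,
    in increasing time order.
    """
    seen = set()
    for time_stamp, covered in snapshots:
        # Emit only segments not recorded at an earlier time stamp.
        for segment in sorted(covered - seen):
            yield time_stamp, segment
        seen |= covered


def covered_at(records, time_stamp):
    """Rebuild the cumulative covered set at a given time from the deltas."""
    return {segment for stamp, segment in records if stamp <= time_stamp}


# Made-up snapshots for illustration only.
demo_snapshots = [
    (900, {(0, 10, 1), (0, 12, 5)}),
    (1800, {(0, 10, 1), (0, 12, 5), (1, 3, 9)}),
]
demo_records = list(new_segments_per_snapshot(demo_snapshots))
```

Each segment is stored once, at its first-covered time stamp, and the full covered set at any later time can still be recovered by filtering on `time_stamp`.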

This data is generated over the entire campaign at every snapshot point when coverage is measured but the CSV files are only available after the experiment ends.

Experiment config (local):
        benchmark: libjpeg-turbo-07-2017, freetype2-2017
        fuzzers: libfuzzer, afl
        trials: 2 (each)
        max_total_time: 1860 (2 cycles)

Output files size for compressed CSVs:
        segments.csv.gz:  230.5 kB
        functions.csv.gz: 63.6 kB
        names.csv.gz:  32.0 kB

Based on the experiment results shown above, the size for a full experiment can be estimated as:

        ((size_of_segments.csv.gz + (size_of_functions.csv.gz * measure_cycles * trials)) * fuzzers + size_of_names.csv.gz) * benchmarks

So for a 24-hour experiment (96 cycles) with 20 benchmarks, 15 fuzzers, and 10 trials, the estimated size is ~740 MB. Because the data contains repetitive patterns, the final compressed size may be even lower than 740 MB, since compression algorithms achieve better ratios when records are similar.
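The estimate formula above can be written as a small function. This is only a sketch of the formula as stated: the function and parameter names are hypothetical, and the thread does not say how the per-unit sizes were derived from the measured files, so the inputs in the usage line are placeholders rather than the values behind the ~740 MB figure.

```python
def estimate_total_size(segment_size, function_size, measure_cycles, trials,
                        fuzzers, names_size, benchmarks):
    """Evaluate the size estimate formula from the PR description.

    All sizes must use the same unit (for example kB). As the formula reads,
    segment_size is counted once per fuzzer, function_size once per cycle and
    trial, and names_size once per benchmark.
    """
    per_fuzzer = segment_size + function_size * measure_cycles * trials
    per_benchmark = per_fuzzer * fuzzers + names_size
    return per_benchmark * benchmarks


# Placeholder inputs chosen only to make the arithmetic easy to follow:
# ((1 + 1 * 2 * 3) * 4 + 5) * 6
demo_estimate = estimate_total_size(segment_size=1, function_size=1,
                                    measure_cycles=2, trials=3, fuzzers=4,
                                    names_size=5, benchmarks=6)
```

The function term dominates because it scales with cycles, trials, fuzzers, and benchmarks at once, whereas the segment term does not grow with cycles or trials.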

Attaching all the generated files here for your reference. functions.csv.gz names.csv.gz segments.csv.gz

BharathMonash avatar Dec 15 '20 06:12 BharathMonash

All (the pull request submitter and all commit authors) CLAs are signed, but one or more commits were authored or co-authored by someone other than the pull request submitter.

We need to confirm that all authors are ok with their commits being contributed to this project. Please have them confirm that by leaving a comment that contains only @googlebot I consent. in this pull request.

Note to project maintainer: There may be cases where the author cannot leave a comment, or the comment is not properly detected as consent. In those cases, you can manually confirm consent of the commit author(s), and set the cla label to yes (if enabled on your project).

ℹ️ Googlers: Go here for more info.

google-cla[bot] avatar Dec 15 '20 06:12 google-cla[bot]

@googlebot I consent.

mboehme avatar Dec 15 '20 10:12 mboehme

Thanks @jonathanmetzman. I have made a few changes as requested. Let me come up with a better name for the module and come up with concrete test cases as well. Any suggestions from your side for the test cases are welcome :)

BharathMonash avatar Dec 21 '20 12:12 BharathMonash

Hi @jonathanmetzman, hope this note finds you well. This is just a gentle reminder to review the requested changes :)

BharathMonash avatar Jan 12 '21 07:01 BharathMonash

Sorry @BharathMonash I'll try to get to this today

jonathanmetzman avatar Jan 12 '21 14:01 jonathanmetzman

Thanks @jonathanmetzman, completely at your convenience. Sorry, I didn't mean to rush you.

BharathMonash avatar Jan 13 '21 01:01 BharathMonash

@inferno-chromium could you also please take a look and share your thoughts on this?

jonathanmetzman avatar Jan 14 '21 02:01 jonathanmetzman

Thanks, @jonathanmetzman. I have incorporated the requested changes. I have left a few comments on name_to_id stuff to explain why we are using it in detail. https://github.com/google/fuzzbench/pull/987/files/aa765473f4417e96ab7eef93275021589e67a148#r557080634 Let me know what you think about it or if you have a better idea :)

BharathMonash avatar Jan 14 '21 06:01 BharathMonash

Thanks for the green light, @jonathanmetzman! We haven't been able to trial this PR at FuzzBench scale. I would suggest running a FuzzBench-scale experiment on this branch before merging.

mboehme avatar Jan 15 '21 23:01 mboehme

I will run a trial official experiment using this PR over the weekend. Also, will try to review soonish.

inferno-chromium avatar Jan 15 '21 23:01 inferno-chromium

Thanks @jonathanmetzman and @inferno-chromium. I hope that the trial run is successful 🤞.

BharathMonash avatar Jan 16 '21 00:01 BharathMonash

The experiment is failing with "Error occurred during measuring.": https://www.fuzzbench.com/reports/experimental/2021-01-15/index.html

2021-01-16 14:11:33.837 PST
Error occurred during measuring.
{
 insertId: "1ggan0tfu029ej"  
 jsonPayload: {
  component: "dispatcher"   
  experiment: "2021-01-15"   
  instance_name: "d-2021-01-15"   
  message: "Error occurred during measuring."   
  subcomponent: "measurer"   
  traceback: "Traceback (most recent call last):
  File "/work/src/experiment/measurer/measure_manager.py", line 112, in measure_loop
    if not measure_all_trials(experiment, max_total_time, pool,
  File "/work/src/experiment/measurer/measure_manager.py", line 148, in measure_all_trials
    manager.list(  # pytype:disable=attribute-error
  File "/usr/local/lib/python3.8/multiprocessing/managers.py", line 740, in temp
    token, exp = self._create(typeid, *args, **kwds)
  File "/usr/local/lib/python3.8/multiprocessing/managers.py", line 623, in _create
    conn = self._Client(self._address, authkey=self._authkey)
  File "/usr/local/lib/python3.8/multiprocessing/connection.py", line 502, in Client
    c = SocketClient(address)
  File "/usr/local/lib/python3.8/multiprocessing/connection.py", line 630, in SocketClient
    s.connect(address)
ConnectionRefusedError: [Errno 111] Connection refused

coming from this line

    # Multiprocessing list to store all trial-specific detailed_coverage_data.
    trail_specific_coverage_data_list = (
        manager.list(  # pytype:disable=attribute-error
            [detailed_coverage_data]))

inferno-chromium avatar Jan 16 '21 22:01 inferno-chromium

Sorry about that. We'll investigate. Worst case, we'll trade memory for disk storage and write the trial specific coverage data to the (trial specific) subject-fuzzer/trial folder to handle concurrency in the file system and post-process sequentially. This would also be less invasive for the measurer and simplify unit testing.
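A minimal sketch of that fallback, assuming a hypothetical layout of one `coverage.json` file per trial folder: each worker writes only to its own trial's file, so no shared in-memory list is needed, and a single sequential pass merges the results afterwards. The function names, the JSON format, and the directory names are illustrative assumptions, and threads stand in here for the measurer's worker processes.

```python
import json
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def write_trial_coverage(base_dir, trial_id, records):
    """Worker step: write one trial's coverage records to its own file."""
    trial_dir = os.path.join(base_dir, 'trial-%d' % trial_id)
    os.makedirs(trial_dir, exist_ok=True)
    path = os.path.join(trial_dir, 'coverage.json')
    with open(path, 'w') as handle:
        json.dump(records, handle)
    return path


def merge_sequentially(paths):
    """Post-processing step: read the per-trial files one at a time."""
    merged = []
    for path in sorted(paths):
        with open(path) as handle:
            merged.extend(json.load(handle))
    return merged


def run_demo():
    """Write two made-up trials concurrently, then merge them in order."""
    base_dir = tempfile.mkdtemp()
    jobs = [(1, [[1, 900, 'f', 2]]), (2, [[2, 900, 'g', 1]])]
    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = [pool.submit(write_trial_coverage, base_dir, trial_id, rows)
                   for trial_id, rows in jobs]
        paths = [future.result() for future in futures]
    return merge_sequentially(paths)
```

Because every writer owns a distinct path, the file system handles the concurrency, and the merge step can be unit tested on plain files without a running measurer.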

mboehme avatar Jan 16 '21 23:01 mboehme

No worries at all, this fat measurer is really hard to predict. We will split it sometime in Q2; right now we are focusing on bug benchmarks. For the CL, using disk sounds totally good: we have a ton of disk space and can even add more as needed.

inferno-chromium avatar Jan 17 '21 00:01 inferno-chromium