
Sancov-based ctx/ngram LibAFL fuzzers


Hello, fuzzbench team.

We implemented ctx and ngram coverage based on SanitizerCoverage (libafl_ctx_large_map, libafl_ctx_mid_map, libafl_ctx_small_map, libafl_ngram_large_map, libafl_ngram_mid_map, libafl_ngram_small_map). The previous implementation was based on AFL's LLVM pass, which has a negative impact on performance.

Therefore we want to see how this new implementation compares to the baseline. Both ctx and ngram have three variants depending on the map size we use.
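
To give a rough idea of what these modes compute, here is a minimal sketch of deriving ngram and ctx indices on top of plain sancov edge IDs. The map size, mixing, and constants below are illustrative assumptions, not the actual LibAFL implementation:

```python
# Purely illustrative sketch: deriving ngram-N and ctx coverage indices
# from plain sancov edge IDs. MAP_SIZE and the mixing are assumptions.

MAP_SIZE = 65_536  # e.g. a "mid" map; the large/small variants only change this


def ngram_index(cur_edge: int, history: list[int]) -> int:
    """Mix the current edge ID with the last N-1 edge IDs (ngram coverage)."""
    idx = cur_edge
    for i, prev in enumerate(history):
        idx ^= (prev << (i + 1)) & 0xFFFFFFFF  # cheap positional mixing
    return idx % MAP_SIZE


def ctx_index(cur_edge: int, ctx_hash: int) -> int:
    """Fold a calling-context hash into the edge index (ctx coverage), so the
    same edge reached from different call stacks lands in different slots."""
    return (cur_edge ^ ctx_hash) % MAP_SIZE


# Edge 7 seen after edges [17, 42, 99] (ngram-4) and under context 0xDEADBEEF.
print(ngram_index(7, [17, 42, 99]), ctx_index(7, 0xDEADBEEF))
```

The map-size variants only change MAP_SIZE: a larger map reduces index collisions at the cost of more overhead when scanning the map.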

tokatoka avatar Feb 21 '24 18:02 tokatoka

This one is ready

The command is /gcbrun run_experiment.py -a --experiment-config /opt/fuzzbench/service/experiment-config.yaml --experiment-name 2024-02-23-libafl --fuzzers libafl_ctx_large_map libafl_ctx_mid_map libafl_ctx_small_map libafl_ngram_large_map libafl_ngram_mid_map libafl_ngram_small_map

@DonggeLiu Could you run this experiment? :)

tokatoka avatar Feb 23 '24 12:02 tokatoka

/gcbrun run_experiment.py -a --experiment-config /opt/fuzzbench/service/experiment-config.yaml --experiment-name 2024-02-26-libafl --fuzzers libafl_ctx_large_map libafl_ctx_mid_map libafl_ctx_small_map libafl_ngram_large_map libafl_ngram_mid_map libafl_ngram_small_map

DonggeLiu avatar Feb 25 '24 23:02 DonggeLiu

Oops, could you please add a dummy change to enable PR experiments? Thanks :)

DonggeLiu avatar Feb 25 '24 23:02 DonggeLiu

Hi, I made the change. I also added the more recent LibAFL to compare as the baseline.

The command would be

/gcbrun run_experiment.py -a --experiment-config /opt/fuzzbench/service/experiment-config.yaml --experiment-name 2024-02-23-libafl --fuzzers libafl_ctx_large_map libafl_ctx_mid_map libafl_ctx_small_map libafl_ngram_large_map libafl_ngram_mid_map libafl_ngram_small_map libafl_280224

tokatoka avatar Feb 28 '24 15:02 tokatoka

/gcbrun run_experiment.py -a --experiment-config /opt/fuzzbench/service/experiment-config.yaml --experiment-name 2024-02-29-libafl --fuzzers libafl_ctx_large_map libafl_ctx_mid_map libafl_ctx_small_map libafl_ngram_large_map libafl_ngram_mid_map libafl_ngram_small_map libafl_280224

DonggeLiu avatar Feb 28 '24 21:02 DonggeLiu

Hello @DonggeLiu, could you do another run?

In this change, we

- base the LibAFL fuzzers on the same commit as #1840 so we can make a fair comparison (sorry, the previous run was based on the latest LibAFL; it is my fault :man_facepalming:)
- add value-profile (named libafl_fuzzbench_vp_alter) and ngram-8 fuzzers; both are new sancov-based implementations

The command is

/gcbrun run_experiment.py -a --experiment-config /opt/fuzzbench/service/experiment-config.yaml --experiment-name 2024-03-06-libafl --fuzzers libafl_fuzzbench_ngram4 libafl_fuzzbench_ngram8 libafl_fuzzbench_ctx libafl_fuzzbench_vp_alter

tokatoka avatar Mar 06 '24 18:03 tokatoka

~~Thanks @tokatoka, unfortunately we are making some changes to FuzzBench at the moment and may need a few days before it is ready for experiments :)~~

It seems to be ready now; I will run it below.

DonggeLiu avatar Mar 11 '24 03:03 DonggeLiu

/gcbrun run_experiment.py -a --experiment-config /opt/fuzzbench/service/experiment-config.yaml --experiment-name 2024-03-11-libafl --fuzzers libafl_fuzzbench_ngram4 libafl_fuzzbench_ngram8 libafl_fuzzbench_ctx libafl_fuzzbench_vp_alter

DonggeLiu avatar Mar 11 '24 03:03 DonggeLiu

Experiment 2024-03-11-libafl data and results will be available later at: The experiment data. The experiment report.

DonggeLiu avatar Mar 11 '24 03:03 DonggeLiu

This is done, thank you! 👍

tokatoka avatar Apr 12 '24 15:04 tokatoka

Hi @DonggeLiu. This is not the preparatory test for the long-run experiment that I was talking about last week; it is an additional experiment for our fuzzer comparison paper.

In this experiment we want to evaluate the degree of interference between two of the fuzzer's components (i.e., whether components X and Y of the fuzzer interact better or worse when combined).
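
As a rough illustration of what such an interference check looks like (the numbers below are made up, not results from this experiment; in practice they would be median coverage values taken from the FuzzBench report):

```python
# Hypothetical illustration of checking interference between two components
# X and Y: compare the coverage gain of adding X when Y is off vs. when Y is on.
cov = {"base": 100, "X": 110, "Y": 108, "XY": 112}  # made-up coverage values

gain_x_alone = cov["X"] - cov["base"]   # X added to the baseline
gain_x_with_y = cov["XY"] - cov["Y"]    # X added on top of Y
print(gain_x_alone, gain_x_with_y)
# If the second gain is clearly smaller, X and Y overlap (negative interaction);
# if it is larger, they complement each other.
```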

Can you run the experiment for the following 5 fuzzers? The command is

/gcbrun run_experiment.py -a --experiment-config /opt/fuzzbench/service/experiment-config.yaml --experiment-name 2024-04-17-libafl --fuzzers libafl_fuzzbench_fast libafl_fuzzbench_fast_ngram4 libafl_fuzzbench_fast_value_profile libafl_fuzzbench_ngram libafl_fuzzbench_value_profile

tokatoka avatar Apr 17 '24 12:04 tokatoka

Sure, could you please fix the presubmit failure? Thanks!

DonggeLiu avatar Apr 17 '24 22:04 DonggeLiu

done!

tokatoka avatar Apr 19 '24 11:04 tokatoka

/gcbrun run_experiment.py -a --experiment-config /opt/fuzzbench/service/experiment-config.yaml --experiment-name 2024-04-22-libafl --fuzzers libafl_fuzzbench_fast libafl_fuzzbench_fast_ngram4 libafl_fuzzbench_fast_value_profile libafl_fuzzbench_ngram libafl_fuzzbench_value_profile

DonggeLiu avatar Apr 22 '24 03:04 DonggeLiu

The experiment failed to launch because of an invalid fuzzer name. Maybe libafl_fuzzbench_ngram should be libafl_fuzzbench_ngram4 or libafl_fuzzbench_ngram8?

DonggeLiu avatar Apr 22 '24 04:04 DonggeLiu

I'm sorry 😔 It is libafl_fuzzbench_ngram8

so /gcbrun run_experiment.py -a --experiment-config /opt/fuzzbench/service/experiment-config.yaml --experiment-name 2024-04-22-libafl --fuzzers libafl_fuzzbench_fast libafl_fuzzbench_fast_ngram4 libafl_fuzzbench_fast_value_profile libafl_fuzzbench_ngram8 libafl_fuzzbench_value_profile

tokatoka avatar Apr 22 '24 10:04 tokatoka

/gcbrun run_experiment.py -a --experiment-config /opt/fuzzbench/service/experiment-config.yaml --experiment-name 2024-04-23-libafl --fuzzers libafl_fuzzbench_fast libafl_fuzzbench_fast_ngram4 libafl_fuzzbench_fast_value_profile libafl_fuzzbench_ngram8 libafl_fuzzbench_value_profile

DonggeLiu avatar Apr 22 '24 23:04 DonggeLiu

Hi. I adjusted the map size because previously it was using a map that was too large.

Can you run it again along with an additional fuzzer, libafl_fuzzbench_ngram4?

/gcbrun run_experiment.py -a --experiment-config /opt/fuzzbench/service/experiment-config.yaml --experiment-name 2024-04-4-libafl --fuzzers libafl_fuzzbench_fast_ngram4 libafl_fuzzbench_ngram4

tokatoka avatar Apr 24 '24 12:04 tokatoka

/gcbrun run_experiment.py -a --experiment-config /opt/fuzzbench/service/experiment-config.yaml --experiment-name 2024-04-24-libafl --fuzzers libafl_fuzzbench_fast_ngram4 libafl_fuzzbench_ngram4

DonggeLiu avatar Apr 24 '24 13:04 DonggeLiu

Hi @DonggeLiu, I waited a few days for the 2024-04-24-libafl experiment, but the results do not look complete. I downloaded the CSV from https://www.fuzzbench.com/reports/experimental/2024-04-24-libafl/data.csv.gz. However, it does not contain results for the full 24 hours, and most trials end at around 17.5 hours. I suspect it is due to running out of resources, since a different experiment was conducted on the same day (2024-04-24-full-muttfuzz).
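
A quick way to verify this from the downloaded CSV (a sketch, assuming the usual snapshot columns fuzzer, benchmark, trial_id, and time in seconds; adjust the column names if the schema differs):

```python
# Rough check of how far each trial got before its last recorded snapshot.
import pandas as pd

df = pd.read_csv("data.csv.gz")
last_hours = df.groupby(["fuzzer", "benchmark", "trial_id"])["time"].max() / 3600
print(last_hours.describe())
print(f"{(last_hours < 23).mean():.0%} of trials stopped before ~23 h")
```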

Could you run this again for me? Thanks.

tokatoka avatar May 02 '24 15:05 tokatoka

/gcbrun run_experiment.py -a --experiment-config /opt/fuzzbench/service/experiment-config.yaml --experiment-name 2024-05-03-libafl --fuzzers libafl_fuzzbench_fast_ngram4 libafl_fuzzbench_ngram4

DonggeLiu avatar May 03 '24 00:05 DonggeLiu

> Hi @DonggeLiu, I waited a few days for the 2024-04-24-libafl experiment, but the results do not look complete. I downloaded the CSV from https://www.fuzzbench.com/reports/experimental/2024-04-24-libafl/data.csv.gz. However, it does not contain results for the full 24 hours, and most trials end at around 17.5 hours. I suspect it is due to running out of resources, since a different experiment was conducted on the same day (2024-04-24-full-muttfuzz).

Thanks for reporting this, @tokatoka! I've re-started the experiment in case it is flaky. Usually, we expect the experiment to finish within four days. Please feel free to ping me if it takes longer.

While waiting for the new experiment to complete, did you have a chance to look into the run log of your experiment? Sometimes, the fuzzer's run log may have clues, too. I am curious to learn the reason and see if there is anything we can improve.

DonggeLiu avatar May 03 '24 00:05 DonggeLiu

> While waiting for the new experiment to complete, did you have a chance to look into the run log of your experiment? Sometimes, the fuzzer's run log may have clues, too.

For the LibAFL log, we do not log anything, so I can only guess by the last corpus update time.

These days I almost always see a weird restart (or you could call it VM preemption) in the results. That is,

  1. First the experiment starts.
  2. Then, after 10~12 hours, most instances suddenly stop, and the result page also reverts back.
  3. After that the instances start again, and this time they run for 23 hours.

For me, I'm fine as long as they give the complete result in step 3. However, step 3 didn't happen this time; as far as I can tell from the result, the instances didn't finish their 23-hour run. When I quickly looked through the results at https://storage.googleapis.com/fuzzbench-data/index.html?prefix=2024-04-24-libafl/experiment-folders/, the instances with bigger trial IDs (these are the restarted instances from step 3 above) somehow all stopped fuzzing around 1 PM to 2 PM on 26th April. So I would assume something happened during that period.
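
For reference, one way to spot when each trial last wrote a corpus snapshot (a sketch; it assumes public read access to the fuzzbench-data bucket, gsutil on PATH, and the usual corpus-archive layout under experiment-folders, which may differ):

```python
# List corpus archives with their creation times to see when trials stopped
# writing snapshots. The bucket path and archive layout are assumptions.
import subprocess

out = subprocess.run(
    ["gsutil", "ls", "-l",
     "gs://fuzzbench-data/2024-04-24-libafl/experiment-folders/**/corpus/*.tar.gz"],
    capture_output=True, text=True, check=True,
).stdout
for line in out.splitlines():
    print(line)  # "<size>  <timestamp>  gs://.../corpus-archive-NNNN.tar.gz"
```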

tokatoka avatar May 03 '24 15:05 tokatoka

And on the other hand, https://storage.googleapis.com/fuzzbench-data/index.html?prefix=2024-04-24-full-muttfuzz

Oddly, here the latest corpus time is around the morning of 26th April. So in theory, there should have been no "out of resources" on the afternoon of 26th April, because by that time this experiment had already finished.

tokatoka avatar May 03 '24 15:05 tokatoka

> However, step 3 didn't happen this time; as far as I can tell from the result, the instances didn't finish their 23-hour run. When I quickly looked through the results at https://storage.googleapis.com/fuzzbench-data/index.html?prefix=2024-04-24-libafl/experiment-folders/, the instances with bigger trial IDs (these are the restarted instances from step 3 above) somehow all stopped fuzzing around 1 PM to 2 PM on 26th April. So I would assume something happened during that period.

Yes, it is indeed strange that no trial instances are running:

  1. There is no trial instance of 2024-04-24-libafl, but the main dispatcher VM still exists:
NAME                         STATUS      CREATION_TIMESTAMP   PREEMPTIBLE  ZONE
d-2024-04-24-libafl          RUNNING     2024-04-25T00:03:46  FALSE        us-central1-c
  2. The report shows the experiment has not finished: https://storage.googleapis.com/www.fuzzbench.com/reports/experimental/2024-04-24-libafl/index.html

  3. The gcloud log shows it has not finished either.

I will kill d-2024-04-24-libafl and let's see how the new experiment goes.

DonggeLiu avatar May 04 '24 00:05 DonggeLiu

BTW, this code ensures trial instances will eventually complete within 2 days: https://github.com/google/fuzzbench/blob/162ca0c277054e479128151ff8b0cdb87727bfac/experiment/scheduler.py#L301-L312

If a trial is preempted after 1 day, then a non-preemptible instance will be used. Not sure why they did not produce results in the last experiment, though.
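
In other words, the intent is roughly the following (a simplified sketch of the idea, not the linked scheduler.py code; the threshold is illustrative):

```python
# Simplified sketch: once a preempted trial has been running long enough,
# replace it with a non-preemptible instance so it can still finish within
# the experiment's overall 2-day window.
PREEMPTIBLE_WINDOW_HOURS = 24  # illustrative threshold

def replacement_instance(hours_since_experiment_start: float, was_preempted: bool) -> str:
    if was_preempted and hours_since_experiment_start >= PREEMPTIBLE_WINDOW_HOURS:
        return "non-preemptible"
    return "preemptible"

print(replacement_instance(26.0, True))  # -> non-preemptible
```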

DonggeLiu avatar May 04 '24 00:05 DonggeLiu

It's complete, thank you 👍

tokatoka avatar May 06 '24 11:05 tokatoka