syzkaller
syzkaller copied to clipboard
pkg/fuzzer: use a MAB to decide on exec fuzz vs exec gen
I'm currently running experiments to see whether it brings any fuzzing performance improvements.
Cc #1950
Let's try to use a plain delta-epsylon MAB for this purpose.
To better track its effect, also calculate moving averages of the "new max signal" / "execution time" ratios for exec fuzz and exec gen.
Codecov Report
Attention: Patch coverage is 91.66667% with 10 lines in your changes are missing coverage. Please review.
Project coverage is 62.1%. Comparing base (
b90978b) to head (dbd3f2d).
Additional details and impacted files
| Files | Coverage Δ | |
|---|---|---|
| pkg/fuzzer/stats.go | 100.0% <100.0%> (ø) |
|
| pkg/ipc/ipc.go | 52.5% <100.0%> (+0.4%) |
:arrow_up: |
| pkg/fuzzer/fuzzer.go | 85.2% <97.6%> (+3.4%) |
:arrow_up: |
| syz-manager/http.go | 0.0% <0.0%> (ø) |
|
| pkg/learning/mab.go | 94.4% <94.4%> (ø) |
|
| pkg/learning/window.go | 90.3% <90.3%> (ø) |
|
| syz-manager/rpc.go | 0.0% <0.0%> (ø) |
Okay, this particular approach will likely not work well because the "new signal / execution time" ratio can have a pretty large range of values. The bounded learning rate, even being as small as 0.0005, would not cope with extreme outliers.
I'm currently testing how useful it may be if we just look at the probability of generating any new coverage. This is the easiest implementation-wise.
The "new signal / execution time" ratio itself still sounds like a much better target, but we need to map it to a smaller range of values to use as a reward for a MAB. Perhaps keep some history of values of the last N "new signal / execution time" values and see what share of them is less than the current observed value? Then it would nicely map to [0;1].
The only caveat is that if N is large, we might need some tricky data structures to determine this rate efficiently (balanced trees should be able do it in O(log N), not sure if we can do it better?).
Maybe we could map coverage and execution time to some logarithmic buckets: 1, 10, 100, or finer-grained 1,2,5,10,20,50,100...? Also amount of coverage may be very different for different syscalls, some may give some huge chunks of coverage, and some just few coverage points.