fuzzilli
fuzzilli copied to clipboard
Implement More Intelligent Mutation/Code Generator/Constant Selection Scheduler
Fuzzilli current uses hard-coded weights to select between the various mutators, code generators, and constants. These weights are just approximations, and a more intelligent mutator scheduler could help improve performance, by giving additional weight to those likely to find new coverage, or be successful.
One possible example would be something like MOpt, although it may be overkill at this time.
First thoughts on requirements for a new scheduler:
- Applies more emphasis to mutators/code generators/constant selectors that are more likely to run successfully or produce new coverage
- Have a "weight floor", intermittent re-normalization, etc, such that some no mutator/code generator/constant is ever completely ignored
Great idea! Some more initial thoughts:
- Besides the weights of the mutators, it would also be nice to automatically determine a somewhat optimal "aggressivity" for each mutator, i.e. how many changes are performed by one invocation of the mutator
- While (as you already noted) the success rate could also be used as a second input to the algorithm besides the "interesting" rate, I think it's quite possible that a mutator with a fairly high "failure" rate (producing lots of semantically invalid samples) actually has a very good "interesting" rate. Probably something to evaluate later on
Hi. I am working on this but I am quite new to developing Fuzzilli. Should I make a WIP PR so I can get more feedback? My code is now probably really messy 😅Thank you!
Hey! Sure, feel free to open a WIP PR for early feedback if you think it helps :)
Hi, I was wondering how the current weights are manually determined. Is there a process that you have used to calculate these?
For the CodeGenerators, the weights mostly just depending on how "relevant" the generated features are. For example, binary operations are probably more interesting on average than instanceof. For mutators, the weights are also set to keep the correctness rate somewhere between 50-75%, see also https://github.com/googleprojectzero/fuzzilli/blob/main/Sources/Fuzzilli/Mutators/MutatorSettings.swift But it's all pretty arbitrary...
Thank you for the insight into this, I am trying to formulate an algorithm that would dynamically rebalance the weights for the code generators. I am considering a heuristic based approach that favors coverage, is that alone a good measure of success? Are there any other heuristics that should be considered?
Yeah it's a tricky question, see e.g. https://github.com/googleprojectzero/fuzzilli/blob/main/Docs/HowFuzzilliWorks.md#limitations-of-the-mutation-engine Ideally, the metric would be newly found crashes. However, in practice that doesn't really work as fuzzing is just too unpredictable and there are (usually) way too few crashes (so there's not enough data). So I think in short, just use coverage as a proxy metric, then try to evaluate how well that actually works with large (>= 1G iterations) fuzzing runs afterwards. If total coverage increases, or is found more quickly, then that's probably a good sign. Also of course any new crashes.