go-fuzz
Q: Why does restarting/recompiling often generate new corpus entries?
When running fuzz tests, I very often see the fuzzer run for hours without generating additional corpus entries:
2019/08/19 09:18:10 workers: 11, corpus: 551 (9h53m ago), crashers: 0, restarts: 1/9859, execs: 12274551 (344/sec), cover: 1378, uptime: 9h55m
However, when restarting, it will generate new cases, usually within minutes:
2019/08/19 09:24:35 workers: 11, corpus: 551 (4m36s ago), crashers: 0, restarts: 1/1016, execs: 11183 (41/sec), cover: 1378, uptime: 4m36s
2019/08/19 09:24:38 workers: 11, corpus: 552 (1s ago), crashers: 0, restarts: 1/1025, execs: 11280 (40/sec), cover: 1378, uptime: 4m39s
To me (no expert) this seems like a missed opportunity, and I am wondering whether it is possible to get this behavior without having to manually restart the process.
I wish I knew! I’ve observed the same thing and spent some time trying to understand why. I don’t yet. My best theory is that it has to do with the re-minimization that occurs on startup, but I have neither concrete evidence for that nor an explanation of why it would help.
Somewhat related...
I have some mutations implemented locally that are expensive but effective. I’ve been thinking for a while about monitoring coverage and, if progress stalls, switching into a “rattle-the-cage” mode where we do a bunch of expensive mutations to try to reach new territory. This is not an unfamiliar approach: I read a paper about folks using a SAT solver (way more expensive!) in such a way.
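For concreteness, here is a minimal Go sketch of what that stall detector might look like. Everything in it (cheapMutate, expensiveMutate, the toy coverage function, the threshold) is hypothetical and looks nothing like go-fuzz’s real mutation or coverage machinery; it only shows the mode switch.

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// cheapMutate flips a single bit; it stands in for the normal fast mutations.
func cheapMutate(data []byte) []byte {
	out := append([]byte(nil), data...)
	if len(out) > 0 {
		out[rand.Intn(len(out))] ^= 1 << uint(rand.Intn(8))
	}
	return out
}

// expensiveMutate stands in for the costly-but-effective mutations;
// here it just stacks many cheap ones.
func expensiveMutate(data []byte) []byte {
	out := data
	for i := 0; i < 16; i++ {
		out = cheapMutate(out)
	}
	return out
}

// coverage is a toy stand-in for real edge-coverage measurement.
func coverage(data []byte) int {
	n := 0
	for _, b := range data {
		if b%7 == 0 {
			n++
		}
	}
	return n
}

func main() {
	const stallThreshold = 5 * time.Second // toy value; much longer in practice

	corpus := [][]byte{[]byte("seed input")}
	maxCover := 0
	lastProgress := time.Now()

	for i := 0; i < 1000000; i++ {
		input := corpus[rand.Intn(len(corpus))]

		// Rattle the cage: once coverage has stalled for a while,
		// switch to the expensive mutations until progress resumes.
		var mutated []byte
		if time.Since(lastProgress) > stallThreshold {
			mutated = expensiveMutate(input)
		} else {
			mutated = cheapMutate(input)
		}

		if c := coverage(mutated); c > maxCover {
			maxCover = c
			lastProgress = time.Now()
			corpus = append(corpus, mutated)
			fmt.Printf("corpus: %d, cover: %d\n", len(corpus), maxCover)
		}
	}
}
```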
I also tried adding tracing so that I could tell which mutations introduced new coverage. (This is trickier than it sounds; the code base is not set up for it.) The idea was then to treat the problem as a multi-armed bandit, re-weighting the mutations over time to favor the effective ones. Then you can throw in anything. I never got it fully working, and my initial experiments were not positive, although I did learn that of the two dozen mutations in use, a handful are responsible for the vast majority of the new coverage.
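For anyone curious, here is a crude sketch of the re-weighting idea, assuming a hypothetical mutation table and a stand-in newCoverage check; a real bandit algorithm would also decay weights and force exploration, and none of these names come from go-fuzz itself.

```go
package main

import (
	"fmt"
	"math/rand"
)

// mutation pairs a mutator with a running weight. The weight grows each
// time the mutator produces new coverage, so effective mutations are
// chosen more often over time.
type mutation struct {
	name   string
	apply  func([]byte) []byte
	weight float64
}

// pick chooses a mutation with probability proportional to its weight.
func pick(muts []mutation) *mutation {
	total := 0.0
	for i := range muts {
		total += muts[i].weight
	}
	r := rand.Float64() * total
	for i := range muts {
		if r -= muts[i].weight; r <= 0 {
			return &muts[i]
		}
	}
	return &muts[len(muts)-1]
}

// newCoverage stands in for checking whether a mutated input reached
// previously unseen edges; here "flip-bit" is artificially luckier.
func newCoverage(name string) bool {
	if name == "flip-bit" {
		return rand.Float64() < 0.02
	}
	return rand.Float64() < 0.005
}

func main() {
	muts := []mutation{
		{"flip-bit", func(d []byte) []byte { return d }, 1},
		{"dup-block", func(d []byte) []byte { return d }, 1},
		{"splice", func(d []byte) []byte { return d }, 1},
	}

	for i := 0; i < 100000; i++ {
		m := pick(muts)
		_ = m.apply([]byte("input"))
		if newCoverage(m.name) {
			m.weight++ // reward: this mutation found something new
		}
	}
	for _, m := range muts {
		fmt.Printf("%-10s weight %.0f\n", m.name, m.weight)
	}
}
```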
I’m happy to share some more details about my more promising experiments if someone is interested in doing some work to get them integrated. Otherwise, I’ll eventually finish my slow process of cleaning up the worthwhile ones and upstreaming them incrementally.
Yes, it's frequently quite hard to understand what exactly happens inside a fuzzer and why. Minimization is a possible cause, which suggests that implementing the "corpus rotation" idea (https://github.com/google/syzkaller/issues/1348) in go-fuzz should fix this and improve the fuzzer beyond that. I've seen a number of indirect and almost-direct signals suggesting that such corpus rotation should be very profitable.
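The details are in the linked issue, but as a toy illustration of the rotation idea (the fraction and the retire policy below are guesses for illustration, not what the issue specifies):

```go
package main

import (
	"fmt"
	"math/rand"
)

// rotate retires a random fraction of the active corpus. The retired
// inputs are kept aside rather than deleted, so nothing is lost; the
// fuzzer just has to rediscover (and re-minimize) those paths from the
// remaining seeds, which can shake it out of local optima.
func rotate(active, retired [][]byte, frac float64) (newActive, newRetired [][]byte) {
	newRetired = retired
	for _, input := range active {
		if rand.Float64() < frac {
			newRetired = append(newRetired, input)
		} else {
			newActive = append(newActive, input)
		}
	}
	return newActive, newRetired
}

func main() {
	active := [][]byte{[]byte("a"), []byte("b"), []byte("c"), []byte("d")}
	var retired [][]byte
	active, retired = rotate(active, retired, 0.5)
	fmt.Printf("active: %d, retired: %d\n", len(active), len(retired))
}
```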
Re multi-armed bandit, I think we need to start with a framework that allows us to more explicitly distribute iterations (time?) across multiple strategies, with statically assigned priorities. The second (optional) step would be to learn/re-learn the probabilities. But if you ask me, I am always ready to give a moderate amount of iterations to any good new strategy, because behavior over infinite time is more important in the end (especially in the context of continuous fuzzing).
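To make that first step concrete, here is a sketch of distributing iterations across strategies by statically assigned priorities; the strategy names and shares are made up, and the learning step is deliberately left out.

```go
package main

import (
	"fmt"
	"math/rand"
)

// strategy is a fuzzing strategy with a statically assigned priority:
// its relative share of iterations. The optional second step, learning
// the shares online, would adjust priority over time.
type strategy struct {
	name     string
	priority int
	run      func()
}

func main() {
	strategies := []strategy{
		{"baseline-mutations", 80, func() {}},
		{"expensive-mutations", 15, func() {}},
		{"new-experimental", 5, func() {}}, // any new strategy gets a moderate share
	}

	total := 0
	for _, s := range strategies {
		total += s.priority
	}

	// Weighted-random dispatch: each iteration goes to one strategy,
	// in proportion to its priority.
	counts := map[string]int{}
	for i := 0; i < 10000; i++ {
		r := rand.Intn(total)
		for _, s := range strategies {
			if r -= s.priority; r < 0 {
				s.run()
				counts[s.name]++
				break
			}
		}
	}
	fmt.Println(counts)
}
```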
If you read the AFL mailing list, there are lots of patches/papers that improve AFL's coverage fuzzing in certain cases. It might be interesting to see which of those we would want to import into go-fuzz.