Grammar-Mutator
Idea list
Let's collect some ideas on how to improve the grammar mutator. I am not an expert on this, so some ideas might be impossible, make no sense, or even make things worse.
- Use the dictionary with the grammar (-x + LTO AUTODICT feature)
- Increase the tree depth with every new cycle without finds (example on how to pass this to the mutator is in examples/honggfuzz/honggfuzz.c)
- ... ?
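The tree-depth idea above could be sketched roughly as follows. This is only an illustration: `next_max_depth` and its parameters (`base_depth`, `step`, `cap`) are hypothetical names, not Grammar-Mutator options, and how the fuzzer reports the number of cycles without finds to the mutator is not shown.

```python
def next_max_depth(cycles_without_finds: int,
                   base_depth: int = 5,
                   step: int = 1,
                   cap: int = 30) -> int:
    """Return the tree depth limit to use for the next queue cycle.

    All parameters are hypothetical tuning knobs chosen for illustration.
    """
    if cycles_without_finds < 0:
        raise ValueError("cycles_without_finds must be non-negative")
    # Grow linearly with each unproductive cycle, but never beyond the cap.
    return min(base_depth + step * cycles_without_finds, cap)
```

The cap is there so that a long unproductive run does not lead to unbounded tree sizes.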
Also: for each mutation, record which mutation strategies were used, and if it results in a new path, crash, or hang, log that somewhere (fopen("a") ... fwrite() ... fclose() would be good enough). From that we can learn which strategy types are more effective than others and then try to improve them: maybe weighting, maybe changing how unsuccessful techniques work, etc. (And of course this feature should sit behind an #ifdef TESTING or something like that.)
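A minimal sketch of the logging idea, written in Python for brevity rather than in the mutator's own C. Everything named here is an assumption for illustration: the environment variable `GRAMMAR_MUTATOR_TESTING` stands in for the `#ifdef TESTING` guard, and the outcome labels (`new_path`, `crash`, `hang`) and strategy names are made up.

```python
import os
from collections import Counter

# Stand-in for the "#ifdef TESTING" guard: logging is only active on request.
# The variable name is hypothetical.
TESTING = os.environ.get("GRAMMAR_MUTATOR_TESTING") == "1"


def log_mutation(path, strategies, outcome, enabled=TESTING):
    """Append one record: the outcome, a tab, then the strategies that were used."""
    if not enabled:
        return
    # Append mode, comparable to opening the file with mode "a" in C.
    with open(path, "a", encoding="utf-8") as log:
        log.write(outcome + "\t" + ",".join(strategies) + "\n")


def summarize(path):
    """Count per strategy how often it was used and how often it led to a find."""
    used, useful = Counter(), Counter()
    with open(path, encoding="utf-8") as log:
        for line in log:
            outcome, _, names = line.rstrip("\n").partition("\t")
            for name in names.split(","):
                if not name:
                    continue  # record without any strategy names
                used[name] += 1
                if outcome in ("new_path", "crash", "hang"):
                    useful[name] += 1
    return used, useful
```

The two counters returned by `summarize` are enough to compute a per-strategy success rate offline, which is the input any later weighting scheme would need.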
pinging @h1994st @andreafioraldi @eqv for more ideas
+1 for gathering statistics on the mutators! If I had the time to change a few things about nautilus2, it would be the following:
- Make sure that fuzzing misbehaving targets (e.g. ones that pkill their parent, rmdir("/"), etc.) works, because that shows up a lot.
- Allow to build custom generators in some easy scripting language for some/each node. For example this can be used to generate semi-well typed JS, which GREATLY increases the performance of the fuzzer.
+1 for gathering statistics on the mutators
I also listed some enhancements in TODO.md
- Enhance the mutation: incorporate AFL-style mutations into the grammar mutator for havoc stage (Andrea gave me some suggestions)
- Enhance the ability to specify grammars: add regex support for specifying grammar rules
  - This is already supported in Nautilus
@eqv A follow-up question to your second bullet point: do you mean adding support for a scripting language when specifying the grammar? Like the Python-based grammar in Nautilus?
No, that stuff is ok and useful in some cases (e.g. to generate XML), but I was thinking more like:
# assume we have a function call rule like this:
# $FunctionCall -> $FunctionName "(" $Args ")"
generate("FunctionCall", lambda ctx: ctx.function_call("foo", "3," + ctx.generate("Args")))
in which case we would have a special generator that generates foo(3, $(random args here)).
Similar generators could be used to generate sensible loops, etc.
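To make the custom-generator idea concrete, here is a small self-contained sketch. It is not the Nautilus API: the `Context` class, its `register` and `generate` methods, and the toy grammar are all hypothetical and only illustrate how a per-node generator could override the default random expansion.

```python
import random

# Toy grammar: each nonterminal maps to a list of alternatives. Items that
# start with "$" are nonterminals, everything else is literal text.
GRAMMAR = {
    "$FunctionCall": [["$FunctionName", "(", "$Args", ")"]],
    "$FunctionName": [["foo"], ["bar"]],
    "$Args": [["$Arg"], ["$Arg", ",", "$Args"]],
    "$Arg": [["1"], ["x"]],
}


class Context:
    """Hypothetical generation context; not the Nautilus API."""

    def __init__(self, grammar, seed=None, max_depth=8):
        self.grammar = grammar
        self.rng = random.Random(seed)
        self.max_depth = max_depth
        self.custom = {}  # nonterminal -> callable(ctx) returning a string

    def register(self, nonterminal, generator):
        """Install a custom generator that replaces the default expansion."""
        self.custom[nonterminal] = generator

    def generate(self, symbol, depth=0):
        if not symbol.startswith("$"):
            return symbol  # literal text
        if symbol in self.custom:
            return self.custom[symbol](self)
        alternatives = self.grammar[symbol]
        if depth >= self.max_depth:
            # In this toy grammar the shortest alternative always terminates.
            chosen = min(alternatives, key=len)
        else:
            chosen = self.rng.choice(alternatives)
        return "".join(self.generate(part, depth + 1) for part in chosen)


ctx = Context(GRAMMAR, seed=1)
# Custom generator: always call foo with 3 as the first argument,
# followed by randomly generated arguments.
ctx.register("$FunctionCall", lambda c: "foo(3," + c.generate("$Args") + ")")
call = ctx.generate("$FunctionCall")
```

Here `call` always has the shape `foo(3,<random args>)`, while every other node is still expanded from the grammar.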
I see. Thanks for your explanation! This looks pretty useful.
and learn which types are more effective than others, and then try to improve them. maybe weighting, maybe changing how unsuccessful techniques work
Basically grammar MOpt.
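For illustration only: MOpt itself schedules mutation operators with particle swarm optimization. The sketch below swaps in a much simpler success-rate weighting to show the general shape of "pick strategies in proportion to how useful they have been". The function names and the `floor` parameter are hypothetical.

```python
import random


def strategy_weights(used, useful, floor=0.05):
    """Map each strategy to a selection weight from its observed success rate.

    `floor` keeps unsuccessful strategies selectable so they can recover.
    """
    weights = {}
    for name, count in used.items():
        rate = useful.get(name, 0) / count if count else 0.0
        weights[name] = max(rate, floor)
    return weights


def pick_strategy(weights, rng=random):
    """Sample one strategy name with probability proportional to its weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]
```

A usage example would be `pick_strategy(strategy_weights(used, useful))`, where `used` and `useful` are per-strategy counts of how often a strategy ran and how often it produced a find.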
Use the dictionary with the grammar (-x + LTO AUTODICT feature)
We can try to take a testcase and replace terminal nodes with the dictionary tokens when they are compatible (regex needed).
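A rough sketch of the terminal replacement described above, assuming each terminal rule is described by a regex. The terminal names, patterns, and the flat list of `(terminal, text)` leaves are hypothetical simplifications; a real implementation would walk the derivation tree and receive the dictionary from the fuzzer.

```python
import random
import re

# Hypothetical terminal rules: name -> regex describing what the terminal may contain.
TERMINAL_PATTERNS = {
    "NUMBER": re.compile(r"[0-9]+"),
    "IDENT": re.compile(r"[A-Za-z_][A-Za-z0-9_]*"),
}


def compatible_tokens(terminal, dictionary):
    """Return the dictionary tokens that fully match the terminal's regex."""
    pattern = TERMINAL_PATTERNS[terminal]
    return [tok for tok in dictionary if pattern.fullmatch(tok)]


def replace_terminals(leaves, dictionary, rng):
    """Replace each (terminal, text) leaf with a compatible token when one exists."""
    mutated = []
    for terminal, text in leaves:
        candidates = compatible_tokens(terminal, dictionary)
        mutated.append((terminal, rng.choice(candidates) if candidates else text))
    return mutated
```

Using `fullmatch` rather than `search` matters here: a token is only substituted when the whole token is a valid instance of the terminal, so the mutated testcase still conforms to the grammar.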
Yes. We need a way to hand over the dictionary to the mutator though, but that is a minor thing; @domenukk has already thought about passing the afl structure to mutators in a stable way.
Passing the afl structure to the mutators would enhance the capabilities of custom mutators a lot.
Or, alternatively, we could expose functions to the custom mutators. This could be a bit more overhead during development, but would mean the mutators can easily be plugged into other fuzzers, like LibAFL. Otherwise the state struct virtually becomes the API. Both ways are fine though.
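The function-based alternative could look something like the sketch below, shown in Python for brevity. All names (`FuzzerServices`, `dictionary_tokens`, `cycles_without_finds`, `DummyFuzzer`) are hypothetical and do not correspond to an existing AFL++ or LibAFL interface; the point is only that the mutator depends on a few functions rather than on the layout of a state struct.

```python
from typing import List, Protocol


class FuzzerServices(Protocol):
    """Hypothetical function-based API a fuzzer could expose to custom mutators."""

    def dictionary_tokens(self) -> List[bytes]: ...

    def cycles_without_finds(self) -> int: ...


class DummyFuzzer:
    """Stand-in host; a real fuzzer would read these values from its own state."""

    def __init__(self, tokens, idle_cycles):
        self._tokens = list(tokens)
        self._idle_cycles = idle_cycles

    def dictionary_tokens(self):
        return list(self._tokens)

    def cycles_without_finds(self):
        return self._idle_cycles


def describe_inputs(services: FuzzerServices) -> str:
    """A mutator-side function that only touches the exposed functions."""
    return "%d tokens, %d idle cycles" % (
        len(services.dictionary_tokens()), services.cycles_without_finds())
```

Any host that provides the same two functions can drive the mutator, which is the portability benefit mentioned above; the cost is that each new piece of fuzzer state needs its own accessor.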