dapptools `dapp mutate`

Description

This adds a simple mutation testing framework to dapptools based on universalmutator.

Mutation testing can be thought of as a kind of reverse fuzzing, where instead of generating random inputs to our test functions, we make random mutations (i.e. introduce bugs) to the source code of the system under test, and then run the test suite to see if the mutation is detected.

The idea is that mutations that are undetected should highlight behaviours that are not well specified by the test suite and hopefully provide some insight to drive improvements to the test suite (or to uncover areas that should be targeted for special attention during audit).
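As a toy illustration of the idea (not how `dapp mutate` or universalmutator work internally), the Python sketch below applies three hand-picked mutations to a one-line function and reports which ones a test suite fails to detect. The function, the mutation list, and both suites are made up for the example.

```python
# Source of the (hypothetical) system under test.
ORIGINAL = "def is_adult(age):\n    return age >= 18\n"

# Hypothetical mutation operators: each swaps one token for another.
MUTATIONS = [(">=", ">"), (">=", "<="), ("18", "19")]


def load(src):
    """Execute a source string and return the is_adult function it defines."""
    namespace = {}
    exec(src, namespace)
    return namespace["is_adult"]


def weak_suite(f):
    """A test suite that never probes the boundary value."""
    return f(30) is True and f(5) is False


def strong_suite(f):
    """The same suite plus checks on both sides of the boundary."""
    return weak_suite(f) and f(18) is True and f(17) is False


def surviving_mutants(suite):
    """Return the mutations that the given suite fails to detect."""
    survivors = []
    for old, new in MUTATIONS:
        mutant = load(ORIGINAL.replace(old, new, 1))
        if suite(mutant):  # suite still passes, so the mutant survived
            survivors.append((old, new))
    return survivors


print(surviving_mutants(weak_suite))    # [('>=', '>'), ('18', '19')]
print(surviving_mutants(strong_suite))  # []
```

With the weak suite, two of the three mutants survive, which points at the untested boundary; adding the boundary checks kills all of them.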

I have been using dss as a testbed while developing, and despite the generally high coverage of the test suite there, dapp mutate was able to uncover many bugs that would have slipped through the test suite (e.g. subtle changes to behaviour in math functions, flipped / altered conditions in important require statements).

The workflow currently looks like this:

1. Run `dapp mutate gen`. This iterates over every non-test file under `DAPP_SRC` and generates mutated versions. `solc` is invoked as part of this process to ensure that each mutated version compiles.
2. Run `dapp mutate filter`. This iterates over every generated mutant and checks whether the test suite detects the mutation.

While dapp mutate filter is running, users can call dapp mutate status to get an overview of the current progress, or dapp mutate show-diffs to see the mutations that were not detected by the test suite.

Filter in particular is especially time consuming, and it's probably worth running it overnight or perhaps disabling some particularly time consuming tests. This seems acceptable since users will probably not want to run dapp mutate on every commit / change, but would rather run the analysis periodically to check the status of their test suite.

Still needs docs / changelogs and maybe some tests before merge.

Checklist

- [ ] tested locally
- [ ] added automated tests
- [ ] updated the docs
- [ ] updated the changelog

Oct 04 '21 19:10 d-xo

wowwww very cool

I'm wondering if it might be a good idea to break dapp mutate gen, dapp mutate show-diffs and dapp mutate filter into separate files, and then make a general convenient dapp mutate --iterations=<number> command which does the generation followed by the filtering, with a flag showing the diff.

Oct 04 '21 19:10 MrChico

ah yes true that would probably be more convenient. We're a little constrained by the interface provided by universalmutator unfortunately (which doesn't e.g. allow for the generation of a specific number of mutants), but I'll have a play around and see if I can make it work...

Oct 04 '21 19:10 d-xo

> Filter in particular is especially time consuming, and it's probably worth running it overnight or perhaps disabling some particularly time consuming tests. This seems acceptable since users will probably not want to run dapp mutate on every commit / change, but would rather run the analysis periodically to check the status of their test suite.

> ah yes true that would probably be more convenient. We're a little constrained by the interface provided by universalmutator unfortunately (which doesn't e.g. allow for the generation of a specific number of mutants), but I'll have a play around and see if I can make it work...

@gakonst something something rust re-write candidate?

Oct 04 '21 21:10 wminshew

@d-xo very exciting work - thank you for spearheading this. @wminshew it's possible but I haven't done any research on that yet.

Oct 04 '21 22:10 gakonst

> which doesn't e.g. allow for the generation of a specific number of mutants

Not sure if you misunderstood me or not, but to clarify: with --iterations I was referring to the number of times the process of generating mutants and filtering them would be repeated, not the number of mutants created.

Oct 05 '21 08:10 MrChico

> @gakonst something something rust re-write candidate?

my guess is filtering is expensive purely because it goes over a large number of mutants and runs a bunch of tests, so the only way to seriously speed it up would be to make every test run faster -- which probably isn't a rust rewrite candidate?

(this is the author of universalmutator chiming in)

Oct 05 '21 18:10 agroce

> which doesn't e.g. allow for the generation of a specific number of mutants

hmm. one thing to do, that might be better than total random selection, is to 1) generate mutants then 2) prioritize them and cut off at N? analyze_mutants doesn't take an N argument to run N mutants, but it can take a file, and prioritize takes an optional N and can return only the top N mutants, and dump those into a file
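A rough Python sketch of the "prioritize, then cut off at N" step, assuming the selection is handed on as a plain text file with one mutant filename per line. The scoring function, the filenames, and the file format are all placeholders for illustration; this does not reproduce universalmutator's actual prioritization or its file format.

```python
from pathlib import Path


def top_n(mutants, score, n):
    """Return the n highest-scoring mutants, best first."""
    return sorted(mutants, key=score, reverse=True)[:n]


def write_selection(mutants, path):
    """Write one mutant filename per line (assumed format)."""
    Path(path).write_text("".join(f"{m}\n" for m in mutants))


if __name__ == "__main__":
    # Hypothetical scores; a real ranking would come from the prioritizer.
    scores = {"Vat.mutant.0.sol": 0.2, "Vat.mutant.1.sol": 0.9, "Vat.mutant.2.sol": 0.5}
    selected = top_n(list(scores), scores.get, 2)
    write_selection(selected, "selected_mutants.txt")
    print(selected)
```

Only the mutants listed in the resulting file would then need to be run against the test suite.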

Oct 05 '21 18:10 agroce

> > @gakonst something something rust re-write candidate?

> my guess is filtering is expensive purely because it goes over a large number of mutants and runs a bunch of tests, so the only way to seriously speed it up would be to make every test run faster -- which probably isn't a rust rewrite candidate?

> (this is the author of universalmutator chiming in)

I am admittedly out of my league here but my interpretation of this thread/demo was that tests already ~are running significantly faster in the [incomplete] rust version? happy to be corrected & have a better understanding of the interplay going forward !

Oct 05 '21 19:10 wminshew

In that case, yes, speeding up tests really really helps, since you basically get N mutants * (all (relevant) tests runtime)

If you break on test failure, of course, that's a maximum and gets better the sooner you detect failures!
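To make the arithmetic concrete, here is a small Python sketch of that upper bound and of stopping at the first failing test. The function names and example numbers are hypothetical, and tests are modelled as plain callables that return True when they pass.

```python
def worst_case_seconds(n_mutants, suite_seconds):
    """Upper bound on filtering time: every mutant runs the whole suite."""
    return n_mutants * suite_seconds


def run_until_first_failure(tests, mutant):
    """Run tests in order and stop at the first failure.

    Returns (detected, tests_run).
    """
    for count, test in enumerate(tests, start=1):
        if not test(mutant):
            return True, count
    return False, len(tests)


if __name__ == "__main__":
    # 500 mutants against a 2 minute suite: at most 60000 s (about 16.7 hours).
    print(worst_case_seconds(500, 120))

    # A toy "mutant" (an int) and three toy tests; the second one fails.
    tests = [lambda m: m > 0, lambda m: m % 2 == 0, lambda m: m < 100]
    print(run_until_first_failure(tests, 7))  # (True, 2)
```

Ordering the tests so that the ones most likely to fail run first moves the actual cost further below the bound.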

Oct 05 '21 19:10 agroce

The main issue for generating only N mutants in universalmutator is that right now they would be for the first <N lines only. I think I could add a mode to randomize the line processing order, if that would be useful?
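A minimal Python sketch of what randomizing the line order could look like, so that a cap of N mutants samples the whole file rather than only its beginning. This is an illustration of the idea only, not universalmutator code; `mutate_line` is a hypothetical callback that returns the mutants for a single line.

```python
import random


def pick_mutants(lines, mutate_line, n, seed=None):
    """Visit lines in random order and collect up to n (line_index, mutant) pairs."""
    if n <= 0:
        return []
    rng = random.Random(seed)
    order = list(range(len(lines)))
    rng.shuffle(order)
    picked = []
    for i in order:
        for mutant in mutate_line(lines[i]):
            picked.append((i, mutant))
            if len(picked) == n:
                return picked
    return picked  # fewer than n mutants were available


if __name__ == "__main__":
    source = ["z = x + y;", "require(z >= x);", "return z;"]

    def swap_plus(line):
        # Toy mutation operator: replace the first '+' with '-'.
        return [line.replace("+", "-", 1)] if "+" in line else []

    print(pick_mutants(source, swap_plus, 1, seed=0))
```

Passing a fixed `seed` keeps the selection reproducible between runs.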

Oct 05 '21 19:10 agroce

Note that if you're willing to call the mutator many times, one undocumented (and not well tested!) trick is to call the mutator with --fuzz and you will get "back" (at most) one mutant, in fuzz.out. This was for other purposes, but is a quick and dirty way right now to get a random mutant on demand.

Oct 06 '21 21:10 agroce

e.g., to get (maybe) one mutant of a dumb C program, hello.c:

`mutate hello.c --cmd "clang hello.c" --fuzz`

it'll make one mutant, and if that mutant is invalid, you won't get anything, but if it is valid, you'll get it in fuzz.out
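A hedged Python sketch of wrapping that call so it can be repeated on demand. It uses only the `--cmd` and `--fuzz` flags shown above; that `fuzz.out` is written to the working directory and contains the mutant's source is an assumption, and the `run` parameter exists only so the wrapper can be exercised without the tool installed.

```python
import subprocess
from pathlib import Path


def random_mutant(source, compile_cmd, workdir=".", run=subprocess.run):
    """Ask `mutate --fuzz` for one mutant; return its text, or None if none was produced."""
    out = Path(workdir) / "fuzz.out"  # assumed output location
    if out.exists():
        out.unlink()  # drop any stale result from a previous call
    run(["mutate", source, "--cmd", compile_cmd, "--fuzz"], cwd=workdir, check=False)
    return out.read_text() if out.exists() else None


def collect_mutants(source, compile_cmd, attempts, **kwargs):
    """Call random_mutant repeatedly and keep the distinct valid results."""
    seen = []
    for _ in range(attempts):
        mutant = random_mutant(source, compile_cmd, **kwargs)
        if mutant is not None and mutant not in seen:
            seen.append(mutant)
    return seen
```

Since each call yields at most one mutant and invalid ones are dropped, `collect_mutants` may return fewer results than `attempts`.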

Oct 06 '21 21:10 agroce

Hi @agroce! Thanks so much for the input (and for building this very nice tool in the first place) :sparkling_heart:

> e.g., to get (maybe) one mutant of a dumb C program, hello.c:

> `mutate hello.c --cmd "clang hello.c" --fuzz`

> it'll make one mutant, and if that mutant is invalid, you won't get anything, but if it is valid, you'll get it in fuzz.out

This seems like exactly what we need, I'll play around with this and see if I can make it work :pray:

Oct 07 '21 20:10 d-xo

> hmm. one thing to do, that might be better than total random selection, is to 1) generate mutants then 2) prioritize them and cut off at N? analyze_mutants doesn't take an N argument to run N mutants, but it can take a file, and prioritize takes an optional N and can return only the top N mutants, and dump those into a file

I hadn't noticed the prioritize command before, what are the heuristics used to prioritize one mutant above another?

Oct 07 '21 20:10 d-xo

Complicated -- a mix of location, nature of the change, and the before/after code. It's basically, right now, an ad hoc mess! But it seems somewhat useful, and I have an active NSF grant with @clegoues to make it more principled.

Oct 07 '21 20:10 agroce

> > which doesn't e.g. allow for the generation of a specific number of mutants

> Not sure if you misunderstood me or not, but to clarify: with --iterations I was referring to the number of times the process of generating mutants and filtering them would be repeated, not the number of mutants created.

@MrChico Not sure I quite understand what you mean here. What exactly would happen on each iteration?

Oct 07 '21 20:10 d-xo

I was thinking one iteration is one round of mutate, filter, and that one may do multiple of those. But maybe it's not more useful to do mutate, filter, mutate, filter... than to just create more mutations in the first place

Oct 08 '21 10:10 MrChico
