consensus-specs Implement some form of mutation testing

Something I would like to do: perform mutation testing on the specs. For each Electra spec function, “disable” one conditional, then run the spec tests. If the tests still pass, that means we’re missing a test. Repeat for all conditionals. An evolution of this might use a mutation library (e.g. https://mutmut.readthedocs.io/en/latest/), but that’s more complicated.
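A minimal sketch of the loop described above, assuming a conditional is either an `assert` or an `if`. The function `process_deposit`, the `toy_tests` runner, and the `DisableNth` helper are all hypothetical stand-ins, not names from the specs or from any mutation library:

```python
import ast
import copy

# Toy stand-in for a spec function; not taken from the actual specs.
SOURCE = '''
def process_deposit(balance, amount):
    assert amount > 0
    if balance + amount > 64:
        return 64
    return balance + amount
'''


def toy_tests(namespace):
    """Stand-in for the spec test suite; True means every check passed."""
    f = namespace["process_deposit"]
    try:
        assert f(10, 5) == 15
        assert f(60, 10) == 64
    except AssertionError:
        return False
    return True


class DisableNth(ast.NodeTransformer):
    """Disable the n-th conditional (assert or if) met during traversal."""

    def __init__(self, n):
        self.n = n
        self.seen = 0
        self.site = None  # (node kind, line number) of the mutated node

    def _hit(self, node):
        hit = self.seen == self.n
        self.seen += 1
        if hit:
            self.site = (type(node).__name__, node.lineno)
        return hit

    def visit_Assert(self, node):
        # Disabling an assert: replace it with `pass`.
        if self._hit(node):
            return ast.copy_location(ast.Pass(), node)
        return node

    def visit_If(self, node):
        hit = self._hit(node)
        self.generic_visit(node)  # nested conditionals are counted too
        if hit:
            # Disabling an if: its body is never taken.
            node.test = ast.copy_location(ast.Constant(value=False), node.test)
        return node


def surviving_mutants(source, run_tests):
    """Return the mutation sites whose mutant still passes the tests."""
    tree = ast.parse(source)
    survivors = []
    n = 0
    while True:
        mutator = DisableNth(n)
        mutant = ast.fix_missing_locations(mutator.visit(copy.deepcopy(tree)))
        if mutator.site is None:  # no n-th conditional: all sites visited
            break
        namespace = {}
        exec(compile(mutant, "<mutant>", "exec"), namespace)
        if run_tests(namespace):
            survivors.append(mutator.site)
        n += 1
    return survivors


print(surviving_mutants(SOURCE, toy_tests))
```

Here the only survivor is the disabled `assert`, because no toy test passes a non-positive `amount`; that is the kind of gap this approach is meant to surface. Running the real spec test suite per mutant would be much slower than this in-process toy runner.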

Apr 15 '25 13:04 jtraglia

Interesting, why do you think a code coverage report wouldn’t be enough to find the gaps that could be found by that approach?

Apr 15 '25 13:04 mkalinin

Interesting, why do you think a code coverage report wouldn’t be enough to find the gaps that could be found by that approach?

Mutation coverage should be more detailed and specific than code coverage. That is, a test touching a particular statement or branch doesn't necessarily "kill" a mutant associated with that statement or branch. See for example Mutation testing overview. Additionally, mutation testing typically involves more mutation operators than merely "disabling" a conditional, see e.g. Mutation operators.

One more aspect is that code coverage tools can be limited: for example, they may measure statement or branch coverage but not condition/predicate coverage. So "disabling" conditionals could provide an alternative.
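To make the difference concrete, here is a small illustration. The predicate `is_eligible` and its hand-written mutant are hypothetical, not spec code. The two test cases take the branch once and skip it once, so statement and branch coverage are complete, yet a mutant that drops one sub-condition still passes:

```python
def is_eligible(balance, slashed):
    """Toy predicate; hypothetical, not from the specs."""
    if balance >= 32 and not slashed:
        return True
    return False


def is_eligible_mutant(balance, slashed):
    """Mutant: the `not slashed` sub-condition has been dropped."""
    if balance >= 32:
        return True
    return False


# One case takes the branch, one skips it: every statement and both
# branch outcomes of the original are exercised.
CASES = [((32, False), True), ((0, True), False)]


def passes(fn, cases):
    """True if fn returns the expected value for every case."""
    return all(fn(*args) == expected for args, expected in cases)


original_ok = passes(is_eligible, CASES)
mutant_survives = passes(is_eligible_mutant, CASES)

# A case that flips `slashed` while the balance is sufficient kills the mutant.
STRONGER = CASES + [((32, True), False)]
mutant_killed = not passes(is_eligible_mutant, STRONGER)

print(original_ok, mutant_survives, mutant_killed)
```

The surviving mutant points at the missing case, where `slashed` changes the outcome on its own, which a statement or branch coverage report would not flag.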

Apr 15 '25 16:04 ericsson49

What Alex said + it's essentially what Michael Sproul did here to find this missing test case:

https://github.com/ethereum/consensus-specs/issues/4257

Apr 15 '25 17:04 jtraglia

mutmut is a good low-effort catch-all, with the caveat that it doesn't support seeding. That said, since it uses deterministic AST traversal, re-running it on the same code will produce the same mutations in the same order (as long as the code doesn't change).

It might be easiest to start with mutmut to quickly identify tests that don’t kill mutations, then follow up with a simple AST walker to target specific forks or functions more precisely.
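As a rough sketch of what such a walker could look like, the snippet below lists the conditionals inside a chosen set of functions using only the standard `ast` module. The function names in `SOURCE`, the `ConditionalFinder` class, and `find_sites` are hypothetical; targeting a fork would presumably come down to which module's source gets passed in, which is an assumption about how the generated spec code is laid out:

```python
import ast

# Toy module source; the function names and bodies are made up.
SOURCE = '''
def process_example_request(state, request):
    if request.amount == 0:
        return
    assert request.source in state.validators

def process_other(state, item):
    assert item.slot <= state.slot
'''


class ConditionalFinder(ast.NodeVisitor):
    """Collect conditionals that sit inside the targeted functions."""

    def __init__(self, targets):
        self.targets = set(targets)
        self.current = None  # name of the function being visited
        self.sites = []      # (function name, node kind, line number)

    def visit_FunctionDef(self, node):
        previous, self.current = self.current, node.name
        self.generic_visit(node)
        self.current = previous

    def _record(self, node):
        if self.current in self.targets:
            self.sites.append((self.current, type(node).__name__, node.lineno))
        self.generic_visit(node)  # keep walking to reach nested conditionals

    # Treat these node types as mutation sites.
    visit_If = visit_Assert = visit_While = visit_IfExp = _record


def find_sites(source, targets):
    """Return the mutation sites found in the named functions."""
    finder = ConditionalFinder(targets)
    finder.visit(ast.parse(source))
    return finder.sites


for site in find_sites(SOURCE, {"process_example_request"}):
    print(site)
```

Each reported site could then be mutated individually and checked against only the tests relevant to that function, which keeps each run short compared with a whole-codebase sweep.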

Jul 25 '25 10:07 moodmosaic