mutmut 3.3.0 does not mutate decorated functions/methods
It might be something about the way I am doing it, but I cannot get mutmut 3.3.0 to mutation test more than one file in this codebase/branch.
Directory structure is...
├── _starter_kit
│ ├── src
│ └── tests
├── fox_goose_corn
│ ├── src
│ └── tests
├── list_things
│ ├── src
│ └── tests
└── thing_rental
  ├── src
  └── tests
Relevant configuration is...
[tool.mutmut]
paths_to_mutate = [
"_starter_kit/",
"fox_goose_corn/",
"list_things/",
"thing_rental/"
]
do_not_mutate = [
"**/tests/*"
]
[tool.pytest.ini_options]
addopts = "--ignore ./mutants"
Whether or not --ignore ./mutants is present has no effect on mutmut's behaviour; it just stops pytest from running the tests in the mutants directory when I fire it off from the root to run all tests.
Running mutmut gives...
➜ rm -rf mutants && poetry run mutmut run
⠦ Generating mutants
done in 133ms
⠇ Running stats
done
⠋ Running clean tests
done
⠏ Running forced fail test
done
Running mutation testing
⠏ 9/9 🎉 9 🫥 0 ⏰ 0 🤔 0 🙁 0 🔇 0
90.25 mutations/second
poetry run mutmut browse only shows any action on fox_goose_corn/src/model/boat.py...
_starter_kit/__init__.py 0 0 0 0 0 0 0 0 0
_starter_kit/src/__init__.py 0 0 0 0 0 0 0 0 0
_starter_kit/src/thing.py 0 0 0 0 0 0 0 0 0
fox_goose_corn/__init__.py 0 0 0 0 0 0 0 0 0
fox_goose_corn/src/__init__.py 0 0 0 0 0 0 0 0 0
fox_goose_corn/src/crossing_manager.py 0 0 0 0 0 0 0 0 0
fox_goose_corn/src/model/boat.py 0 0 0 0 0 0 0 9 0
fox_goose_corn/src/model/cargo_item.py 0 0 0 0 0 0 0 0 0
fox_goose_corn/src/model/river.py 0 0 0 0 0 0 0 0 0
list_things/__init__.py 0 0 0 0 0 0 0 0 0
list_things/src/__init__.py 0 0 0 0 0 0 0 0 0
list_things/src/continuous_subset.py 0 0 0 0 0 0 0 0 0
list_things/src/scattered_subset.py 0 0 0 0 0 0 0 0 0
thing_rental/__init__.py 0 0 0 0 0 0 0 0 0
thing_rental/src/__init__.py 0 0 0 0 0 0 0 0 0
thing_rental/src/exceptions/__init__.py 0 0 0 0 0 0 0 0 0
thing_rental/src/exceptions/person_already_renting_exception.py 0 0 0 0 0 0 0 0 0
thing_rental/src/exceptions/thing_already_rented_exception.py 0 0 0 0 0 0 0 0 0
thing_rental/src/model/__init__.py 0 0 0 0 0 0 0 0 0
thing_rental/src/model/person.py 0 0 0 0 0 0 0 0 0
thing_rental/src/model/rental.py 0 0 0 0 0 0 0 0 0
thing_rental/src/model/thing.py 0 0 0 0 0 0 0 0 0
thing_rental/src/rental_service.py 0 0 0 0 0 0 0 0 0
In case it helps, here is the mutmut-stats.json...
The problem seems to be that all the functions/classes in these files are decorated (e.g. with @typechecked). At least with v3.3.0 these are ignored by mutmut, maybe this is too restrictive?
https://github.com/boxed/mutmut/blob/b124c6ad46b0a9a048b5e47c8b7f93d33a9f2e6a/mutmut/file_mutation.py#L152-L157
The reasoning is that (most) decorators are evaluated at import time, so removing/copying/changing them could break the whole pytest test collection (and thus all mutations). For instance, when decorating functions with @app.post("/foo"), the app could/should complain if you have this annotation multiple times. Currently mutmut skips these functions. If it tried to mutate the decorated function, it would create a copy of it for every mutation, and thus we would have many @app.post("/foo") registrations.
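To illustrate the import-time problem, here is a minimal sketch using a toy `App` class. It is a stand-in for a web framework, not a real library, and `create_foo_mutant_1` is a hypothetical name for a mutant copy; mutmut's generated code may look different.

```python
class App:
    """Toy stand-in for a web framework; not a real library."""

    def __init__(self):
        self.routes = {}

    def post(self, path):
        def register(func):
            # This runs when the decorated `def` statement is executed,
            # i.e. at import time for module-level functions.
            if path in self.routes:
                raise RuntimeError(f"route {path!r} registered twice")
            self.routes[path] = func
            return func

        return register


app = App()


@app.post("/foo")
def create_foo():
    return "original"


def copy_with_decorator():
    # Simulates a mutant copy of create_foo that keeps the decorator.
    @app.post("/foo")
    def create_foo_mutant_1():
        return "mutated"

    return create_foo_mutant_1


try:
    copy_with_decorator()
    duplicate_error = None
except RuntimeError as exc:
    duplicate_error = str(exc)
```

With a real module, the second registration would happen during import, so test collection would fail before any mutant could be evaluated.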
Maybe an allow/deny list approach could work, where you specify for which decorators we can mutate the decorated functions? Or we keep a deny list of common problematic decorators (not sure which those would be) and only skip those.
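A rough sketch of what an allow list check could look like, using the standard library `ast` module rather than mutmut's actual LibCST-based code. `ALLOWED_DECORATORS`, `decorator_name` and `is_mutable` are hypothetical names for illustration, not existing mutmut options or functions.

```python
import ast

# Hypothetical allow list: decorators assumed to be safe to copy.
ALLOWED_DECORATORS = {"typechecked"}


def decorator_name(node):
    """Return the dotted name of a decorator expression, e.g. 'app.post'."""
    if isinstance(node, ast.Call):  # handles @decorator(...)
        node = node.func
    return ast.unparse(node)


def is_mutable(func):
    """A function qualifies if every one of its decorators is allowed."""
    return all(
        decorator_name(d) in ALLOWED_DECORATORS for d in func.decorator_list
    )


SOURCE = '''
@typechecked
def load(item: str) -> None: ...

@app.post("/foo")
def create_foo(): ...

def plain(): ...
'''

tree = ast.parse(SOURCE)
functions = [n for n in ast.walk(tree) if isinstance(n, ast.FunctionDef)]
mutable = sorted(f.name for f in functions if is_mutable(f))
```

A deny list would invert the check: skip a function only if one of its decorators is listed, and mutate everything else.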
You are correct. I added @typechecked to fox_goose_corn/src/model/boat.py and it had two effects:
- I could drop a test checking the type of the input passed to a method because it's covered by @typechecked.
- Now mutmut doesn't mutate anything.
I think being able to specify an ignore list would be good, perhaps defaulting it to decorators we discover do not usually cause problems.
I will have a poke at that if/when I get some time.
I have also edited the title of the issue to be more helpful to other people.
Not sure if it's better, but another approach to support mutation of arbitrary decorated functions would be to change our mutation setup of these methods, as highlighted in this comment: https://github.com/boxed/mutmut/issues/376#issuecomment-2799877261
Currently, if we want to mutate n parts of a function body, we create n copies of this function. Then, in a mutation run we look at an environment variable to select which mutated function is run. For decorated functions, we could instead keep the one function, but use some conditionals inside of the function to enable the mutation of the test run.
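The two setups described above might look roughly like this. This is a simplified sketch, not mutmut's generated code: the environment variable name `SKETCH_ACTIVE_MUTANT`, the mutant keys and the `register` decorator are all made up for illustration.

```python
import os

# Hypothetical variable name used only for this sketch.
ENV_VAR = "SKETCH_ACTIVE_MUTANT"

applications = []


def register(func):
    """Stand-in for any decorator with an import-time side effect."""
    applications.append(func.__name__)
    return func


# Setup 1: one copy of the function per mutant, selected at call time.
def add_original(a, b):
    return a + b


def add_mutant_1(a, b):
    return a - b  # mutated operator


def add_mutant_2(a, b):
    return a * b  # mutated operator


_COPIES = {"add_1": add_mutant_1, "add_2": add_mutant_2}


def add(a, b):
    selected = os.environ.get(ENV_VAR, "")
    return _COPIES.get(selected, add_original)(a, b)


# Setup 2: a single function, with each mutant behind a conditional.
# The decorator is applied exactly once, however many mutants there are.
@register
def add_inline(a, b):
    selected = os.environ.get(ENV_VAR, "")
    if selected == "add_1":
        return a - b
    if selected == "add_2":
        return a * b
    return a + b
```

In the second setup every call pays for the environment lookup and the conditionals, which is where the performance concern comes from.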
The allow/deny-listing of mutations is much simpler to do. The mutation setup change would be interesting as it could support practically all decorated functions, but requires more implementation effort and I am not 100% sure that it works well and is performant.
Based on my time availability, or lack thereof, I will stick with the allow/deny approach as a first step.
My initial spike raises questions.
This python-sandpit branch is set up to use this mutmut branch, which is hacked to allow typechecked.
mutmut mutates typed things that are covered by typechecked, so there is no value in mutating them.
Adding tests to cover them would negate one of the benefits of enforcing the types, which is not having to write as much code. typechecked won't allow it anyway, which is as it should be.
So I guess to be able to use mutmut on a strictly typed Python codebase, regardless of how the typing is enforced, we would need mutmut to not mutate things to be something not allowed by their type.
I wonder whether the best approach would be to have a flag in mutmut to allow us to tell it to adhere to the type hints when mutating? That would work well enough if a whole codebase is type enforced, but maybe not for something transitioning that currently has a mix.
Maybe we need to get mutmut to always adhere to type hints when mutating code where the types are enforced, starting with typechecked, then rolling on to other ways types can be enforced? That feels like it could get pretty hairy.
It does feel a bit sad that people who want to use strict typing and mutation testing, cannot combine these approaches in the same codebase.
> mutmut mutates typed things that are covered by typechecked, so there is no value in mutating them.
I see your point. typechecked should detect assignments of the wrong type, so why bother mutating something to a wrong type if it will throw an error for sure? That's likely wasted performance. Though if the code is covered (i.e. it is run during your tests, which it should be for testing your business logic), these mutations will be killed. So mutating to wrong types results in mutants that always get killed, which slows down the testing process.
> Adding tests to cover them would negate one of the benefits of enforcing the types, which is not having to write as much code. typechecked won't allow it anyway, which is as it should be.
I think you should add tests that cover your business logic. You should not need tests that verify the types, the mutants will be killed anyway because of typechecked. Looking at your project, the only mutants that survive are in code without any test coverage. So increasing your test coverage would get rid of those mutants as a side effect, without directly unit testing for types.
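As a small illustration of why such mutants get killed without type-specific tests, here is a sketch with a simplified stand-in for a runtime type check. The `checked` decorator below is not typeguard's implementation and only handles plain class annotations; `load`, `board` and `board_mutant` are hypothetical names.

```python
import functools
import inspect


def checked(func):
    """Simplified stand-in for a runtime type check (plain classes only)."""
    signature = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = signature.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            expected = signature.parameters[name].annotation
            if expected is inspect.Parameter.empty:
                continue  # unannotated parameter: nothing to check
            if isinstance(expected, type) and not isinstance(value, expected):
                raise TypeError(f"{name} must be {expected.__name__}")
        return func(*args, **kwargs)

    return wrapper


@checked
def load(item: str) -> str:
    return item.upper()


def board(item="fox"):  # original code
    return load(item)


def board_mutant(item=None):  # mutant: default changed to a wrong type
    return load(item)


def business_test_passes(fn):
    """A business-logic check; it never asserts on types directly."""
    try:
        return fn() == "FOX"
    except TypeError:
        return False  # the runtime check fails the test, killing the mutant
```

The check only exercises the behaviour of `board`, yet the wrongly typed mutant still fails it because the covered code path runs through the type check.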
In general, combining mutmut with typechecking is an interesting idea and could improve mutmut, though it's not straightforward to do and leaves a lot of open questions. Off the top of my head, I would have the following concerns:
- I'm not sure if implementing simple type heuristics would be helpful enough, or if we need a full typechecker (mypy, pyright, etc.)
- different typecheckers have different results, there is no "right" type checking
- currently type checking is pretty slow, we likely don't want to run it for every mutation
"Adhering to type hints" seems tricky to me. LibCST has a method to get the inferred types (https://libcst.readthedocs.io/en/latest/metadata.html#libcst.metadata.TypeInferenceProvider, using the pyre type checker), but this does not tell us if changing a value would break the type system (e.g. a = 'foo' would resolve to the inferred type str. If we mutate it to a = None, depending on the usage of a it could still pass the type checks, now being inferred to type None, or it could fail the type check if later on it's used as b: str = a). I don't know how we would efficiently check if a mutation breaks the type system, without running a type check for each mutation.
In order to organize my testing, I have marked all of my tests with pytest marks, à la @pytest.mark.unit. Reading this issue, I presume that this is impacting my attempt to add mutmut to my testing regimen, since all mutants get marked "not checked". Am I reasoning about this correctly, or is this some user error (not unlikely)?
Thanks again for mutmut.
@tomwillis608 No, this is an unrelated issue. The tests themselves are not touched by mutmut.
@tomwillis608 I've added a commit on the main branch (https://github.com/boxed/mutmut/commit/1febb52642bf60e48c2221001cca931c49f2836c), so that we should report an error earlier for your case, and show some debugging context. If you cannot resolve this, feel free to open another issue and I can take a look.
And yes, decorating test functions is irrelevant, because we do not mutate them. We mutate the source code and there we skip decorated functions.