Type ignore comments erroneously marked as unused by dmypy
There is currently a misbehaviour where "type: ignore" comments are erroneously marked as unused in re-runs of dmypy. There are also cases where errors disappear on the re-run.
As far as I can tell, this only happens in modules which contain an import that we don't know how to type (such as a module which does not exist), and a submodule which is unused.
There was a lot of commenting and investigation on this PR, but I hope that the committed tests and fixes illustrate and address the issue.
Related to https://github.com/python/mypy/issues/9655
I ran into this issue testing a pre-release of 1.3 and it has created enough confusion on the team that I'm not sure we can roll it out until this is fixed. I would love to see this merged before release.
I have discovered that this test passes when I revert https://github.com/python/mypy/pull/14835, so it's probably fair to consider this a regression. I haven't yet discovered exactly why fixing that bug caused this new failure.
Conversely, https://github.com/python/mypy/pull/15049 is not fixed by reverting my last PR.
Once https://github.com/python/mypy/pull/15179 reverts my previous PR, I will rebase this to include those original commits, and hope to come up with a solution that will make both the old and new tests pass.
EDIT: I've rebased. Now I need to work out how to fix this :thinking:
According to mypy_primer, this change doesn't affect type check results on a corpus of open source code. ✅
According to mypy_primer, this change doesn't affect type check results on a corpus of open source code. ✅
I've rebased again to get my tests running smoothly against the latest version of the code.
CI still fails on the regression test added in the last commit.
What follows is my best understanding of the issue as it stands. Please excuse me if I waffle a bit.
TLDR: I've finally tracked down the cause of that failure, but my current fix breaks loads of other stuff, so needs more work.
So as a recap, the setup for the remaining failing test looks like this:
[file unused/__init__.py]
[file unused/empty.py]
[file foo.py]
from unused.empty import *
import a_module_which_does_not_exist
def is_foo() -> str:
return True # type: ignore
The first run of dmypy works correctly and has this (fairly boring) output:
foo.py:2: error: Cannot find implementation or library stub for module named "a_module_which_does_not_exist" [import-not-found]
foo.py:2: note: See https://mypy.readthedocs.io/en/stable/running_mypy.html#missing-imports
But the second run adds this extra error on the end which is clearly wrong:
foo.py:4: error: Unused "type: ignore" comment [unused-ignore]
I believe this is because in the first run, when the ignored error is recorded we return without adding the error into error_info_map. https://github.com/python/mypy/blob/10dfafe089a75dc117586ebab35723da66309398/mypy/errors.py#L501-L506
Then in the second run when we fetch the list of things to re-check, foo.is_foo is missing from previous_targets_with_errors. https://github.com/python/mypy/blob/10dfafe089a75dc117586ebab35723da66309398/mypy/server/update.py#L189
... which is populated from the error_info_map (notice the comment here, it's relevant later). https://github.com/python/mypy/blob/10dfafe089a75dc117586ebab35723da66309398/mypy/errors.py#L929-L935
I have found that changing add_error_info so that it adds this ignored error into error_info_map (with info.hidden=True) does indeed solve the spurious Unused "type: ignore" comment error.
diff --git mypy/errors.py mypy/errors.py
index 4cdeb54f8..0fa365682 100644
--- mypy/errors.py
+++ mypy/errors.py
@@ -503,6 +503,9 @@ class Errors:
self.used_ignored_lines[file][scope_line].append(
(info.code or codes.MISC).code
)
+ if file not in self.ignored_files:
+ info.hidden = True
+ self._add_error_info(file, info)
return
if file in self.ignored_files:
return
Unfortunately, it also causes many other tests to fail. Some have errors like this:
mypy/test/testcheck.py::TypeCheckSuite::check-inline-config.test::testInlineIncremental1:
E AssertionError: Some modules reported error despite no errors
E assert (set() or not {'/home/charlie/code/mypy/test-data/unit/lib-stub/typing.pyi', 'main', 'tmp/a.py', 'tmp/builtins.pyi'})
I might be able to fix those by changing the helper functions on the Errors to filter out hidden errors, or add a new ignored flag on ErrorInfo for the same purpose. I haven't tried this yet. I'm hoping to find a more elegant solution.
Others fail like this:
Expected:
==
Actual:
==
builtins.pyi:18: error: Name "Iterator" is not defined (diff)
builtins.pyi:18: note: Did you forget to import it from "typing"? (Suggestion: "from typing import Iterator") (diff)
... and some fail like this:
Expected:
==
main:4: error: Incompatible return value type (got "str", expected "int")
Actual:
==
main:4: error: Incompatible return value type (got "str", expected "int")
builtins.pyi:6: error: Name "typing" is not defined (diff)
Perhaps the fix for these errors would be to make sure that ErrorInfo's .target is always set to a dotted-Python-path (rather than a file path), so that they're caught by ignored_files. I haven't tried that out yet (or even proved the theory) but I think that's the next thing to try. I think this is supported by the previously highlighted comment: "TODO: Make sure that either target is always defined or that not being defined is okay for fine-grained incremental checking". I think it's not OK, and the target needs to always be defined as a dotted path.
Recap: the patch in the comment above fixed the tested error, but caused (many) other tests to fail.
I now think that my theory from the previous comment (final paragraph) was wrong.
With @samueljsb's assistance (thanks!), I've come to the conclusion that the patch is correct but might be revealing a previously unnoticed issue with the tests.
Example: mypy/test/testfinegrained.py::FineGrainedSuite::fine-grained.test::testAttrsUpdate2 fails with the patch applied with this error:
_____ testAttrsUpdate2 _____
[gw0] linux -- Python 3.10.12 /home/charlie/venvs/mypy/bin/python
data: /home/charlie/code/mypy/test-data/unit/fine-grained.test:940:
Failed: Invalid output (/home/charlie/code/mypy/test-data/unit/fine-grained.test, line 940)
----- Captured stderr call -----
Expected:
==
main:2: error: Missing positional argument "b" in call to "B"
Actual:
==
main:2: error: Missing positional argument "b" in call to "B"
builtins.pyi:18: error: Name "Iterator" is not defined (diff)
builtins.pyi:18: note: Did you forget to import it from "typing"? (Suggestion: "from typing import Iterator") (diff)
But applying this further patch fixes the test:
diff --git test-data/unit/fine-grained.test test-data/unit/fine-grained.test
index 68f72a2aa..1977f709e 100644
--- test-data/unit/fine-grained.test
+++ test-data/unit/fine-grained.test
@@ -959,6 +959,7 @@ class A:
a = attr.ib() # type: int
other = attr.ib() # type: int
[builtins fixtures/list.pyi]
+[typing fixtures/typing-namedtuple.pyi]
[out]
==
main:2: error: Missing positional argument "b" in call to "B"
This leaves rather a lot of other failing tests yet to fix before this is ready to be merged. I haven't yet proved that all the other failing tests share this solution.
After applying the main patch and running against my codebase I still see that dmypy's second run's results differ from the first run's. I assume that is because of the issue outlined in https://github.com/python/mypy/pull/15049, which I am yet to fix.
I have found a nicer solution to the failing tests.
Changing is_errors_for_file to ignore hidden messages keeps the tests passing when the patch which adds ignored errors is applied.
EDIT: I was accidentally only running a subset of the tests locally, so I think I've only fixed some of them. Oops!
EDIT 2: Updating is_errors_for_file takes my local failures from 325 to 165.
Diff from mypy_primer, showing the effect of this PR on open source code:
mitmproxy (https://github.com/mitmproxy/mitmproxy): typechecking got 1.27x slower (49.8s -> 63.1s)
(Performance measurements are based on a single noisy sample)
According to mypy_primer, this change doesn't affect type check results on a corpus of open source code. ✅
Many of the failing tests relate to Iterator not being defined. I've found that use of this pattern in the files in test-data/unit/fixtures/ seems to be common to the cases I've looked at:
class Iterable(Protocol[T_co]):
def __iter__(self) -> 'Iterator[T_co]': pass
class Iterator(Iterable[T_co], Protocol):
def __next__(self) -> T_co: pass
I notice that Iterable is defined in terms of Iterator, and Iterator is defined in terms of Iterable. I would have expected that the quotes in the definition of __iter__ should be enough to make this a legal construction, but removing the __iter__ definition makes many of the tests pass. Unfortunately, it also makes many others fail.
I'm not yet sure if the solution here is to change how Iterator (or Iterable) is defined, or to work out why the quotes aren't doing the trick. If anyone has any insight on this issue, I'd appreciate the assistance!
The quotes shouldn't matter as this is a stub file, which isn't evaluated. The mutually recursive definition of Iterable and Iterator is indeed annoying to deal with, but mypy should be able to handle it.
I'm pretty sure that these stub files are handled in the tests, especially where we override test-data/unit/lib-stub/typing.pyi with declarations like this:
https://github.com/python/mypy/blob/2aa2443107534715a650dbe78474e7d91cc9df20/test-data/unit/check-incremental.test#L4154-L4155
https://github.com/python/mypy/blob/2aa2443107534715a650dbe78474e7d91cc9df20/test-data/unit/fixtures/typing-medium.pyi#L3-L4
I've found that using simpler stub files that lack this circular definition works in the case of many tests, so it feels like there must be some kind of evaluation going on.
Edit: Apologies if I've been over-picky on the term "evaluated" there -- I hope I understood your comment correctly. I can see that you're right about the quotes not being important in this case.
I'd like to try to help with this issue, and have some time next week.
From what I understand, the PR fixes the underlying issue, but causes some apparently unrelated tests to fail in a surprising way. (The odd behaviour is limited to tests and doesn't happen if we run mypy for real.)
I'm assuming this is a problem with the test machinery, and in particular with test fixtures that attempt to import Iterator.
My plan next week is to understand better what's going on by stepping through the test run in a debugger, but if anyone has any inkling as to what might be going on I'd be very grateful for any pointers. @chadrik or @JelleZijlstra I don't suppose you have any idea?
I've created a simple test case to show what is breaking. This test passes (but it shouldn't).
[case testDemo]
[file a.py]
[file a.py.2]
# Comment to trigger reprocessing.
[builtins fixtures/list.pyi]
[out]
==
builtins.pyi:18: error: Name "Iterator" is not defined
builtins.pyi:18: note: Did you forget to import it from "typing"? (Suggestion: "from typing import Iterator")
This test is simulating the addition of a comment to an otherwise empty file. We are using the list.pyi fixture. The first time the module is checked (above the ==) there are no errors, but the second time (below the ==) we get the "Iterator" is not defined error.
I'll continue to try to understand why this is...
Good news - I think I've got somewhere. This WIP commit gets the number of failing tests down to only 3. 🥳
Background
I noticed while debugging that there were some additional errors in the error list following the first check run. These aren't surfaced anywhere obvious, I happened to find these by putting a breakpoint in the final statement of mypy.dmypy_server.Server.initialize_fine_grained and then drilling down to
self.fine_grained_manager.manager.errors.error_info_map.
Most of the errors were along these lines:
'Method "__getitem__" is not using @override but is overriding a method in class "typing.Sequence"'
'Method "__iter__" is not using @override but is overriding a method in class "typing.Iterable"'
These errors aren't present when I run it on master.
How I fixed it
I'm not sure if it's the correct thing to do, but I found adding from typing_extensions import override and then decorating the methods in question made the tests pass.
@chadrik @JelleZijlstra Do you have any concerns with this as a fix? If so, please advise what would be a better thing to do.
There were a couple of other test cases where instead the fix seemed to be just to add the appropriate builtins declaration in the test case.
Remaining failures
The last three are a bit different - I'll continue to look at these tomorrow.
Further context
The reason for the more visible Iterator errors seems to be that the semantic analyzer's globals aren't populated with the typing module when the second pass runs. I am not sure how this relates to the fix in question, I've only got limited understanding of what's going on so far.
Sounds like the error code explicit-override from #15512 is being activated erroneously in the test stubs. Adding the @override decorators feels more like a work-around for that.
Sounds like the error code explicit-override from https://github.com/python/mypy/pull/15512 is being activated erroneously in the test stubs. Adding the @override decorators feels more like a work-around for that.
Thanks for the steer, yes that makes sense.
(EDIT: this comment is probably superceded by the one below, feel free to skip.)
I've taken a different route and followed the logic around error reporting. I noticed a difference between the way other typesheds are handled and specifically the /tmp/builtins.pyi used by the test suite. That possibly needs handling as a special case. By way of a demo, this rather hacky commit adds a couple of special casing for builtins which gets the tests passing (well, the same tests as adding those overrides does). I doubt this is mergeable in its current form but perhaps it might give someone more familiar with the code base an idea of what the right fix might be?
I've found that I can get it down to one failing test with a couple of less-hacky feeling changes, see the commits on the end of this draft PR.
The main difference is that we filter by hidden targets when returning files with errors in mypy.errors.Errors.targets. However that does cause one of the new test cases to fail again.
mypy/test/testdaemon.py::DaemonSuite::daemon.test::testReturnTypeIgnoreAfterUnknownImport
My guess is that mypy.errors.Errors.targets should sometimes return hidden files and sometimes not, it depends on the context. Maybe we should have two methods instead of one. I'm finding this harder to debug though as the failing test is in daemon.test which doesn't play as nicely with my interactive debugger.
Would welcome any thoughts from those more familiar with the internals of the daemon.
I've had a quick call with @seddonym (thank you for your engagement with this), and we've worked out that this patch was recording errors in too many cases. While we wanted to keep track of ignored lines, the previous changes were also tracking errors with ignored error-codes, which explains why the tests were getting so upset. This latest change greatly reduces the number of failed tests. I've included @seddonym as a co-author.
Thanks also to @tmke8 for pointing us at the fact that this error was being registered incorrectly, as that really helped us to track this down.
~When run locally I still get a couple of wheel-related errors. They're up next!~
EDIT: All the tests pass locally after rebasing on the master branch! :tada:
According to mypy_primer, this change doesn't affect type check results on a corpus of open source code. ✅
According to mypy_primer, this change doesn't affect type check results on a corpus of open source code. ✅
Now that tests are passing, I've reworked the commit history so that dead-ends are removed, and they follow the order:
- Tests
- Refactoring
- Changes
I have a feeling that while the new tests now pass, there may yet be some undiscovered edge-cases. I'm going to dig in and see what I can find before opening this for review. It should be noted that these changes do not yet fix the tests in https://github.com/python/mypy/pull/15049.
According to mypy_primer, this change doesn't affect type check results on a corpus of open source code. ✅
I've found that this branch does indeed have a regression similar to the last one we found.
Expand this for details on how to reproduce the issue.
I can reproduce this one with these files:
unused/init.py: (empty)
unused/empty.py: (empty)
foo.py:
import attr
from unused.empty import *
import a_module_which_does_not_exist
@attr.frozen
class A:
def __init__(self) -> None:
self.__attrs_init__() # type: ignore[attr-defined]
Run these commands:
dmypy start -- --warn-unused-ignores --no-error-summary
dmypy check -- foo.py
dmypy check -- foo.py
Run 1 results:
foo.py:3: error: Cannot find implementation or library stub for module named "a_module_which_does_not_exist" [import-not-found]
foo.py:3: note: See https://mypy.readthedocs.io/en/stable/running_mypy.html#missing-imports
Run 2 results:
foo.py:3: error: Cannot find implementation or library stub for module named "a_module_which_does_not_exist" [import-not-found]
foo.py:3: note: See https://mypy.readthedocs.io/en/stable/running_mypy.html#missing-imports
foo.py:9: error: Unused "type: ignore" comment [unused-ignore]
I haven't yet ~~worked out how to make a regression test for this case in this PR yet, nor~~ investigated the cause.
This appears to be related to what I suspect may be an issue with the auto-detection of __init__ methods in the attrs plugin. I have created a test to show that issue in the linked PR.
According to mypy_primer, this change doesn't affect type check results on a corpus of open source code. ✅
This latest patch fixes all the errors I am aware of, including those in https://github.com/python/mypy/pull/15049.
My local code-base can now re-run dmypy without any changes between runs! :tada:
Before asking for review, I want to work out if we need all of the changes I've made here, or if the latest fix ("Don't remove modules with descendants") suffices.
~~I'll merge in the tests from https://github.com/python/mypy/pull/15049 soon.~~ EDIT: done.
According to mypy_primer, this change doesn't affect type check results on a corpus of open source code. ✅
According to mypy_primer, this change doesn't affect type check results on a corpus of open source code. ✅
I've now boiled this down the the minimal changes to fix the issue (though there are a lot of tests), and I think that this is finally ready for review!