Run without module detection logic

Open tuukkamustonen opened this issue 4 years ago • 19 comments

Feature

Allow mypy --no-module-detection (or similar flag) to run without mypy attempting to map files as modules. For example, linting standalone scripts should not require module detection magic.

Pitch

Mypy has (somewhat confusing?) file-to-module mapping logic explained in https://mypy.readthedocs.io/en/latest/running_mypy.html#mapping-file-paths-to-modules. The key takeaway is that mypy attempts to determine a module name for each file it checks (in various ways). For example, a/b/c.py -> a.b.c.

However, the documentation doesn't give rationale for why this actually needs to be always done. If you merely want to check standalone files like mypy one.py then why does this require determining module name for the file?

(AFAIK, other linters like pylint don't have this kind of magic, perhaps because they don't follow imports like mypy does.)

If we have two files mypy one.py two.py and two imports one, then the detection logic is needed, so that module one is found (and it's fine).

However, if I have sub-directories with similarly named files, doing mypy one/file.py two/file.py throws:

two/file.py: error: Duplicate module named 'file' (also at 'one/file.py')

But these are two standalone files, in separate directories, not part of the same package. Why is the error there?
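
For illustration, the documented default behaviour (crawl up from the file for as long as the directory contains an __init__.py) can be modelled in a few lines. This is a simplified sketch, not mypy's implementation: default_module_name and its package_dirs parameter (the set of directories assumed to contain an __init__.py) are hypothetical names, and namespace packages and --explicit-package-bases are ignored.

```python
from pathlib import PurePosixPath


def default_module_name(path: str, package_dirs: frozenset[str] = frozenset()) -> str:
    """Toy model: prepend parent directory names while they are regular packages."""
    file = PurePosixPath(path)
    parts = [file.stem]  # "file.py" -> "file"
    current = file.parent
    # package_dirs is an assumption: directories that contain an __init__.py
    while str(current) in package_dirs:
        parts.append(current.name)
        current = current.parent
    return ".".join(reversed(parts))


# Neither directory is a package, so both files get the same top-level name.
names = [default_module_name(p) for p in ("one/file.py", "two/file.py")]
print(names)
```

Under this model both one/file.py and two/file.py collapse to the same module name, which is the clash the error message reports. If one/ and two/ are treated as packages, the names become distinct.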

Similar trouble is faced when maintaining a monorepo with multiple packages:

package1/
    setup.py
    ...
package2/
    setup.py
   ...

Mypy will throw a duplicate module name error for the setup.py files (and others).

Now, without arguing about how mypy should determine the module names of the files it looks up (surely there's reasoning for that), I'm suggesting adding an option to skip this magic completely. In my case, I always pip install -e . the packages that I'm developing, so whatever imports the code (wherever), those imports are already resolvable. All the packages are already importable, and whatever files reside outside the packages are stand-alone scripts that are never imported.

One downside is that something like this would no longer work:

package1/
    src/
        ...
    setup.py
    tasks.py  # imports util.py
    util.py

For that I don't have a suggestion :( Could it be solved by multiple entries in MYPYPATH or PYTHONPATH?

tuukkamustonen avatar May 06 '21 10:05 tuukkamustonen

Does using --namespace-packages help with this?

atrigent avatar Jun 25 '21 14:06 atrigent

I'm not using namespace packages so no.

And mypy nags about duplicate modules even with MYPYPATH=pkg1,pkg2 mypy --namespace-packages --explicit-package-bases ..

tuukkamustonen avatar Jun 28 '21 07:06 tuukkamustonen

I'm having the same issue. I'm working with a mono repo with multiple packages and getting duplicate module errors because of the same situation as your example above with the 2 setup.py's. Have you found a workaround or a way to scan these separately in the config file?

ns-dmichelini avatar Aug 04 '21 19:08 ns-dmichelini

You could try --exclude '/setup\.py$' if you don't care about checking your setup.py's. Alternatively, something like MYPYPATH=pkg1:pkg2 mypy --namespace-packages --explicit-package-bases . could help (note that MYPYPATH is colon separated, which is maybe why it didn't work for @tuukkamustonen )
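
To show why the separator matters, here is a minimal sketch of how a path-list environment variable is typically split. It assumes MYPYPATH is split on the platform path separator (":" on POSIX), as described above; split_search_path is a hypothetical helper, not a mypy function.

```python
import os


def split_search_path(value: str, sep: str = os.pathsep) -> list[str]:
    """Split a path-list string into entries, dropping empty ones."""
    return [entry for entry in value.split(sep) if entry]


# With ":" as the separator, a comma-joined value stays a single (nonexistent) directory.
print(split_search_path("pkg1,pkg2", sep=":"))
print(split_search_path("pkg1:pkg2", sep=":"))
```

With a comma, the whole string is treated as one directory named "pkg1,pkg2", so neither package directory ends up on the search path.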

hauntsaninja avatar Aug 04 '21 19:08 hauntsaninja

Not sure if my earlier paste is a typo or if I actually misused MYPYPATH but trying it now with colon and there's no change - still getting the error.

tuukkamustonen avatar Aug 09 '21 13:08 tuukkamustonen

My team has a case where we are using Python scripts to generate a large, relatively complex configuration tree. The directory structure and the directory names of these scripts are a meaningful part of the configuration tree, and that includes spaces in the directory names (ruling out making these directories into packages without a significant overhaul). Some of the scripts also have meaningful names, e.g. settings.py in multiple locations in the tree, so we get this "Duplicate module" error. After digging through this and related issues, we haven't yet found any way to typecheck this structure of configuration scripts alongside our project with mypy (--scripts-as-modules and --namespace-packages don't help us). We really don't want to exclude any of these scripts from typechecking, but that has been our unhappy workaround so far.

Maybe we can pass the list of files in each directory to a separate call of mypy using a helper script? We're just surprised that there seems to be no escape hatch that enables us to typecheck a directory structure including one/foo.py and two/foo.py where one and two are not packages and shouldn't be made into packages.

I feel a new flag, e.g. --skip-module-detection <dir> or --skip-module-detection <regex> makes sense to dodge this bullet for a subset of a project.

MattF-NSIDC avatar Sep 01 '21 17:09 MattF-NSIDC

I've faced mypy's "duplicate module" issue in a number of source bases, and I haven't found a good workaround short of renaming modules or telling mypy to ignore parts of the code base. I agree that a new mode would be really useful here. I'm not clear on why this limitation exists in the first place, since it's legal for a code base to have duplicate module names as long as they don't result in a namespace conflict. Perhaps someone who is familiar with the inner workings of mypy could explain why this check exists?

For what it's worth, pyright doesn't have this limitation, so it might suit your needs.

erictraut avatar Sep 01 '21 17:09 erictraut

Interesting. Thanks for your input, though I would have appreciated disclosure that you are the author of Pyright ;) We tried Pyright out, and it was indeed able to typecheck parts of the codebase which mypy refused to check due to the issue in this thread. It's certainly strange adding Node as a development dependency to our project, on the flipside.

MattF-NSIDC avatar Sep 01 '21 20:09 MattF-NSIDC

We tried out PyType because we liked that it could be installed with Python ecosystem tooling (pip), but it had the same problem with "duplicate" modules.

We landed on running mypy in 3 phases:

  1. For the whole project, excluding the problem scripts directory (LAYERS_CFG_DIR in the code)
  2. For the set of files with unique names in the problem directory, all at once.
  3. For the set of files with "duplicate" names in the problem directory, one file at a time.

Now our entire project is being typechecked again! But we have to deal with dozens of runs of mypy. We would really like to achieve this in one run with a flag like --skip-module-detection-in-dir {LAYERS_CFG_DIR}.

Here's the monstrosity if you'd like to adapt it to your project: https://github.com/nsidc/qgreenland/blob/700c9b9812dff4d2ee6827902572b1d59e76cc77/tasks/test.py#L46-L115
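
The three phases above can be sketched roughly as follows. This is not the linked script, just a minimal illustration: plan_mypy_runs and its parameters are hypothetical names, files are grouped by basename only, and problem_dir is passed to --exclude as-is (mypy treats that value as a regular expression, so real paths may need escaping).

```python
from collections import defaultdict
from pathlib import PurePosixPath


def plan_mypy_runs(project_root: str, problem_dir: str, problem_files: list[str]) -> list[list[str]]:
    """Return one argv list per mypy invocation, following the three phases."""
    by_name: dict[str, list[str]] = defaultdict(list)
    for path in problem_files:
        by_name[PurePosixPath(path).name].append(path)

    unique = [paths[0] for paths in by_name.values() if len(paths) == 1]
    duplicated = [p for paths in by_name.values() if len(paths) > 1 for p in paths]

    # Phase 1: whole project, minus the problem directory.
    runs = [["mypy", "--exclude", problem_dir, project_root]]
    # Phase 2: uniquely named files, all in one run.
    if unique:
        runs.append(["mypy", *unique])
    # Phase 3: files whose names clash, one run each.
    runs.extend(["mypy", path] for path in duplicated)
    return runs


for argv in plan_mypy_runs(".", "cfg/", ["cfg/a/settings.py", "cfg/b/settings.py", "cfg/a/layers.py"]):
    print(" ".join(argv))
```

Each returned list could be handed to subprocess.run; the number of runs grows with the number of clashing files, which is the cost described above.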

MattF-NSIDC avatar Sep 02 '21 19:09 MattF-NSIDC

Hi all,

Please, check if this contribution (#12496) could help you.

rprata avatar Mar 31 '22 04:03 rprata

I just literally did what mypy suggested and it worked:

$ mypy a/setup.py b/setup.py
b/setup.py: error: Duplicate module named "setup" (also at "a/setup.py")
b/setup.py: note: See https://mypy.readthedocs.io/en/stable/running_mypy.html#mapping-file-paths-to-modules for more info
b/setup.py: note: Common resolutions include: a) using `--exclude` to avoid checking one of them, b) adding `__init__.py` somewhere, c) using `--explicit-package-bases` or adjusting MYPYPATH
Found 1 error in 1 file (errors prevented further checking)

then

$ mypy a/setup.py b/setup.py --explicit-package-bases
Success: no issues found in 2 source files

tried this on recent commit from master branch.

ilevkivskyi avatar Nov 26 '22 10:11 ilevkivskyi

I had the same problem with a monorepo. Fortunately, ChatGPT came to the rescue.

my structure:

./common/exceptions.py
./frame/exceptions.py
./pumps/exceptions.py

I added an __init__.py to each directory and the error vanished.

nick-brady avatar Mar 22 '23 17:03 nick-brady

I solved this by running multiple mypy checks in my make recipe.

@poetry run mypy --install-types --explicit-package-bases --fast-module-lookup --ignore-missing-imports --strict package2/.
@poetry run mypy --install-types --explicit-package-bases --fast-module-lookup --ignore-missing-imports --strict package2/.

mfdebby avatar Nov 09 '23 08:11 mfdebby

I have a project with a mypy.ini in the top-level:

[mypy]
files = cylc/flow

All the source files are in cylc/flow.

When I run mypy on its own, it picks up the config and runs successfully.

However, when I run mypy ., it fails with

tests/a/thing.py: error: Duplicate module named "thing" (also at "./tests/b/thing.py")

Even mypy . --config-file=./mypy.ini --explicit-package-bases doesn't help.

This is really frustrating, because it breaks VSCode's ability to run mypy.

MetRonnie avatar Jan 18 '24 17:01 MetRonnie

@MetRonnie Thanks, this explains why our CI pipeline stalls with Mypy. Our CI tooling provides a list of Python source file paths, but Mypy seems to (partially) scan other directories than just descendants of (files) src/. These irrelevant directories are very large.

sanmai-NL avatar Jan 18 '24 20:01 sanmai-NL

@MetRonnie Could you clarify what behavior you're expecting here? The docs for files here say (emphasis mine):

A comma-separated list of paths which should be checked by mypy if none are given on the command line.

which means that if you run mypy . the files config option shouldn't have any effect. If all the files you want to check are in cylc/flow then I think you could either use exclude to ignore the tests or try using mypy cylc/flow (possibly combined with adding cylc/flow to mypy_path and using --explicit-package-bases to force it to use cylc/flow as the base for the purposes of determining module names instead of the current directory like it seems to be doing at the moment).

pranavrajpal avatar Jul 04 '24 22:07 pranavrajpal

I have no particular expectations, all I'm saying is the way vscode-mypy works when linting a whole workspace is to pass the path to the workspace, and then this means the files setting in the workspace's config is ignored...

See https://github.com/microsoft/vscode-mypy/issues/157#issuecomment-1824614204:

I guess what you need would be for mypy itself to honor the exclude settings in mypy.ini when it's directly invoked for a file. But this would need an issue for mypy itself and not this extension.

MetRonnie avatar Jul 05 '24 10:07 MetRonnie

Same issue here, using pantsbuild to run mypy on a monorepo. There are conftest.py files in several submodules, and mypy fails with Duplicate module named "conftest". My current configuration:

# /mypy.ini
[mypy]
explicit_package_bases = True
namespace_packages = True

jasondamour avatar Aug 03 '24 16:08 jasondamour

I'm not using namespace packages so no.

And mypy nags about duplicate modules Even with MYPYPATH=pkg1,pkg2 mypy --namespace-packages --explicit-package-bases ..

I observe this still in 2024, and this directly contradicts what is documented in Bullet point 3 of https://mypy.readthedocs.io/en/stable/running_mypy.html#mapping-file-paths-to-modules , so I consider this an implementation bug.

szabi avatar Aug 23 '24 07:08 szabi

I have a plugin examples folder I want to type check, where "examples" contains several subfolders, each with a plugin.py file (which is executed as a script by the plugin runner).

I thought I had successfully worked around this restriction by typechecking the examples folder with mypy -p examples, but it turned out this only worked as long as none of the example plugins performed local imports from their own directory. Once they need to do that, the difference between top level script imports (from script_peer import some_name) and peer submodule imports (from .submodule_peer import some_name) causes mypy to complain that it can't find target modules when the script (correctly) uses a top level import to query its peers.

For my use case, turning off module detection entirely for these subtrees wouldn't be desired (since each script may have local packages alongside it), but a way to say that folders matching the pattern examples/plugins/* should be analysed independently of each other (so duplicate names aren't a problem) would be very helpful.

ncoghlan avatar Aug 07 '25 17:08 ncoghlan

The problem can be solved by passing --namespace-packages --explicit-package-bases UNLESS a parent folder is not a valid module name (e.g., it contains -). Consider the following structure:

my-app/
├─ src/
│  ├─ ...
├─ tests/
│  ├─ __init__.py
│  ├─ ...
├─ packages/
│  ├─ package-1/
│  │  ├─ src/
│  │  │  ├─ ...
│  │  ├─ tests/
│  │  │  ├─ __init__.py
│  │  │  ├─ ...
│  ├─ package-2/
│  │  ├─ src/
│  │  │  ├─ ...
│  │  ├─ tests/
│  │  │  ├─ __init__.py
│  │  │  ├─ ...

When running the following command in my-app:

mypy --namespace-packages --explicit-package-bases .

The tests module will always be considered just tests, as package-1 and package-2 are not valid module names.

(the structure is a UV workspace structure)

jchalupka-pan avatar Sep 16 '25 07:09 jchalupka-pan

Just wanted to highlight @jchalupka-pan's https://github.com/python/mypy/issues/10428#issuecomment-3296296895 above. It's indeed a solution, as long as you obey Python module/package name syntax (i.e. no hyphens) and take a few extra things into account (see the latter part of this message).

Passing --explicit-package-bases causes the tests/ directories to be detected as:

packages.package1.tests
packages.package2.tests

This avoids the duplicate module issue, but it's worth noting that it's a hack.

(And to emphasize: the hack/fix here is that if you had packages/package-1 and packages/package-2, Mypy would ignore those directories as python packages, due to - character, and resolve the packages as just tests and tests.)

The real solution would be some mechanism/configuration/flag that allows validating the tests/ files without mapping them as modules. After all, the tests won't be importing each other, so they don't need to be "registered".

(Btw. --namespace-packages is the default, I don't think it needs to be passed.)


There's more to emphasize in this solution. It requires you to:

  1. Name the packages/* directories with Python package/module compatible syntax (so they can be registered as Python modules)
  2. Have certain mypy_path config
  3. Nest your code within src/ directory

(1) is explained above

(2) Without explicit mypy_path, Mypy will determine the package sources as:

packages.package1.src.package1
packages.package2.src.package2

(Actually, looks like they get registered as that AND as package1/package2. I don't understand how!?)

You need to set:

mypy_path = [
    "packages/package1/src",
    "packages/package2/src",
]

To register the sources as:

package1
package2

(3) If you don't nest your code in src/ but directly as packages/package1/package1 and packages/package2/package2 then for Mypy to detect the modules correctly you would need:

mypy_path = [
    "packages/package1",
    "packages/package2",
]

However, that would "fix" the original hack, and register the tests/ directories as:

tests
tests

...leading to the original duplicate modules problem.
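
The three cases can be reproduced with a toy model of the mapping. This is only a sketch under stated assumptions, not mypy's code: module_name is a hypothetical function that walks up from the file until it reaches a configured base directory or a directory whose name is not a valid identifier, and it ignores __init__.py checks. The file names test_a.py and core.py are made up for the example.

```python
from pathlib import PurePosixPath


def module_name(path: str, bases: set[str]) -> str:
    """Toy model of module naming with --explicit-package-bases."""
    file = PurePosixPath(path)
    parts = [file.stem]
    current = file.parent
    # Stop at a base directory, or at a name such as "package-1" that cannot be a module.
    while str(current) not in bases and current.name.isidentifier():
        parts.append(current.name)
        current = current.parent
    return ".".join(reversed(parts))


# Assumed bases for each layout; "." stands for the current directory.
src_bases = {".", "packages/package1/src", "packages/package2/src"}
flat_bases = {".", "packages/package1", "packages/package2"}

print(module_name("packages/package-1/tests/test_a.py", {"."}))
print(module_name("packages/package1/tests/test_a.py", src_bases))
print(module_name("packages/package1/src/package1/core.py", src_bases))
print(module_name("packages/package1/tests/test_a.py", flat_bases))
print(module_name("packages/package2/tests/test_a.py", flat_bases))
```

In this model the hyphenated directory and the flat layout both yield names starting with plain tests, so two packages clash, while the src/ layout keeps the tests under packages.package1 and packages.package2.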

tuukkamustonen avatar Oct 22 '25 14:10 tuukkamustonen