pylint icon indicating copy to clipboard operation
pylint copied to clipboard

Add way to disable module caching for specific files

Open bpedersen2 opened this issue 6 years ago • 19 comments

  1. Consider a project with: a script bin/packagename a package packagename with some modules 2a. run pylint bin/packagename packagename/somemodule 2b. run pylint packagename/somemodule bin/packagename or pylint individually

Current behavior

2a results in complaints about non-name-in-package 2b runs without warnings

Expected behavior

Both runs can run without warning. The problem seems that in 2a the script is cached/kept loaded as a package.

Possibly related to #689

bpedersen2 avatar May 15 '18 05:05 bpedersen2

It's kind of hard to reproduce this without an actual reproduction case. As far as I can tell, we don't cache any script/filename/package so I'm not sure what you mean by that. Feel free to reopen if you have a particular example that I can use to reproduce this.

PCManticore avatar Aug 01 '18 08:08 PCManticore

I created a test repo: https://github.com/bpedersen2/pylint_reproduce run: 1.

export PYTHONPATH='.'
pylint  bin/project project/test_package/test_mod2.py
No config file found, using default configuration
************* Module project
C: 12, 0: Trailing newlines (trailing-newlines)
C:  1, 0: Missing module docstring (missing-docstring)
************* Module project.test_package.test_mod2
C:  1, 0: Missing module docstring (missing-docstring)
I:  3, 0: Module 'project' has not 'toplevel' member, but source is unavailable. Consider adding this module to extension-pkg-whitelist if you want to perform analysis based on run-time introspection of living objects. (c-extension-no-member)

Versus

 pylint  project/test_package/test_mod2.py bin/project
No config file found, using default configuration
************* Module project.test_package.test_mod2
C:  1, 0: Missing module docstring (missing-docstring)
************* Module project
C: 12, 0: Trailing newlines (trailing-newlines)
C:  1, 0: Missing module docstring (missing-docstring)

------------------------------------------------------------------
Your code has been rated at 6.25/10 (previous run: 0.00/10, +6.25)

bpedersen2 avatar Aug 01 '18 11:08 bpedersen2

The seen modules are cached e.g. for the circular dep. check. The problem is that bin/project is seen internally as a module 'project', while it is not a module resolvable via a valid pythonpath. If I add an _init.py to bin (making it a package), everything works correctly, as the module then becomes 'bin.project', which does not collide with the toplevel project.

bpedersen2 avatar Aug 01 '18 12:08 bpedersen2

I think the correct logic would be: if a specified path 'xxx/yyy' is also importable as 'xxx.yyy', then it is a module to add, otherwise it is some standalone file that could e.g. get recorded as nopackage.xxx.yyy

bpedersen2 avatar Aug 01 '18 12:08 bpedersen2

i met this weird problem too. complaining no name in module but i already had and run well.

just for a new name added , and others are passed in the same file

i printed keys of the imported module from pylint/checkers/variables.py visit_importfrom and can't find the name which is already in the module. which means the module loaded isn't the lastest one. i even delete the module, and it can still print the original keys in the module.

pylint==1.9.5

this is due to __init__.py exists in root dir and the parent was added to path unexpectedly. and there is a subdir project copy of the origin one. namely:

parent/
  project/
    __init__.py
    subproject/
      __init__.py
  subproject/
    __init__.py

so lint find the outter one instead of the inner one subproject. after i delete the unnecessary __init__.py in project, it is fixed.

ericwu-wish avatar May 06 '20 13:05 ericwu-wish

just for a new name added , and others are passed in the same file

ericwu-wish avatar May 06 '20 13:05 ericwu-wish

i printed keys of the imported module from pylint/checkers/variables.py visit_importfrom and can't find the name which is already in the module. which means the module loaded isn't the lastest one. i even delete the module, and it can still print the original keys in the module.

pylint==1.9.5

ericwu-wish avatar May 06 '20 13:05 ericwu-wish

We are having a similar issue, and @bpedersen2 's reproduction seems valid. Could you consider reopening this issue?

benjyw avatar Jun 16 '21 16:06 benjyw

Generally, the issue seems to be that some repos have python files that are never intended to be imported (e.g., tests, scripts) and these can cause naming collisions with importable packages.

benjyw avatar Jun 16 '21 16:06 benjyw

Thanks! I'm wondering if it makes sense to cache only when an actual import is encountered in code? This would mirror importlib behavior. But I don't know enough about Pylint internals to know if what I just said makes any sense...

benjyw avatar Jun 16 '21 17:06 benjyw

Does astroid have knobs pylint could use to say when pylint wants the ast to be cached?

cognifloyd avatar Jun 16 '21 17:06 cognifloyd

I don't think there is a mecanism to restrict the caching to some portion of the code, right now it's all or nothing.

Pierre-Sassoulas avatar Jun 16 '21 17:06 Pierre-Sassoulas

I don't think there is a mecanism to restrict the caching to some portion of the code, right now it's all or nothing.

Right, I think we're wondering if it makes sense to add such a thing due to the real-world problems described above.

benjyw avatar Jun 16 '21 18:06 benjyw

This might be fixed with a better discovery of file to analyses. Making pylint . --exclude='some-regex' (#352 and maybe another one) works, would probably remove the need to launch such a command in the first place Also, I think the problem is not with caching per se, but with the fact that different files hit the same cache (because pylint is not using a proper absolute path to differentiate between them). Once pylint handle that, if file are ambiguous even with a proper abspath because of a change in pythonpath, we can't do anything about it.

Pierre-Sassoulas avatar Jun 16 '21 18:06 Pierre-Sassoulas

Excluding some files would mean they wouldn't get checked? That's not what we'd want.

But I agree that caching against the abspath (or even just the relpath from the CWD), and not the module name, would solve the problem.

Is this cache conserved across runs? Or is it just used within the scope of a single run? If it is then may I suggest caching against the relpath from the CWD, so that the cache is portable?

benjyw avatar Jun 16 '21 19:06 benjyw

I don't think the cache is conserved across runs. We do have a cache for the previous score (in order to do 6.25/10 (previous run: 0.00/10, +6.25)) when running on a file but that's something else I think.

Pierre-Sassoulas avatar Jun 16 '21 19:06 Pierre-Sassoulas

In that case caching against the abspath seems fine.

benjyw avatar Jun 16 '21 19:06 benjyw

I'm having this issue on a work project too. It's quite annoying because the project is structured around microservices and AWS lambdas + layers, so there are quite a few imports that rely on specific things being in the pythonpath which aren't necessarily reflected in the directory structure. In the end Python itself doesn't rely on directory structure alone, so linters probably shouldn't do that either.

Is there any way I can help short of contributing myself (which I can't do right now since I don't know the codebase at all and don't have time to learn it currently)?

berzi avatar Sep 14 '22 10:09 berzi

Not really. This seems like it might actually also be solved by https://github.com/PyCQA/pylint/pull/7357 which tackles https://github.com/PyCQA/pylint/issues/7339. However, that PR introduces a major performance regression which I haven't been able to solve as of yet..

DanielNoord avatar Sep 14 '22 10:09 DanielNoord