pylint
E0401 (import-error) checks perform repeated file reads
Bug description
This is a follow-up to #9310, where I reported slowness with import-error checks due to repetitive I/O over SSHFS.
While profiling the new code, I noticed that the _is_setuptools_namespace checks in astroid cause the same files to be read over and over.
My public example repo shows the following reads:
- 109 reads: pylint-corpus/src/__init__.py
- 50 reads: pylint-corpus/src/resources/sites/pages/page.py/__init__.py
- 50 reads: pylint-corpus/src/resources/results/result.py/__init__.py
I'm hoping that the repeated reads can be prevented to speed up pylint. (My private repo has ~2,200 files and shows >20,000 repeated reads.)
Configuration
[MAIN]
jobs=1
[MESSAGES CONTROL]
disable=all
enable=E0401
[REPORTS]
reports=no
score=no
Command used
Steps to reproduce
git clone --branch import-error-stats https://github.com/correctmost/pylint-corpus.git
cd pylint-corpus
python ./profile_pylint.py
Analysis
strace shows the same files being opened repeatedly:
$ strace -e trace=openat python ./profile_pylint.py 2>&1 | sort | uniq -c | sort -nr
109 openat(AT_FDCWD, "pylint-corpus/src/__init__.py", O_RDONLY|O_CLOEXEC) = 3
50 openat(AT_FDCWD, "pylint-corpus/src/resources/sites/pages/page.py/__init__.py", O_RDONLY|O_CLOEXEC) = -1 ENOTDIR (Not a directory)
50 openat(AT_FDCWD, "pylint-corpus/src/resources/results/result.py/__init__.py", O_RDONLY|O_CLOEXEC) = -1 ENOTDIR (Not a directory)
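For anyone who prefers to tally the opens per path regardless of the return value, here is a rough Python sketch that parses saved strace output. The `count_opens` helper and the regular expression are my own illustration and are not part of pylint, astroid, or the corpus repo; it assumes each line looks like the `openat(...)` lines above.

```python
import re
from collections import Counter

# Matches the quoted path argument of an openat() line as printed by strace.
OPENAT_RE = re.compile(r'openat\([^,]+,\s*"([^"]+)"')


def count_opens(lines):
    """Count openat() calls per path (hypothetical helper for illustration)."""
    counts = Counter()
    for line in lines:
        match = OPENAT_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage (assumed): strace -e trace=openat python ./profile_pylint.py 2>&1 | python count_opens.py
    for path, n in count_opens(sys.stdin).most_common(10):
        print(n, path)
```

Successful and failed opens (for example the ENOTDIR ones) are counted together here, since both cost a round trip on a network filesystem.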
It seems possible to avoid most of these reads with caching around _is_setuptools_namespace, but I wonder if _is_setuptools_namespace should even be called with a non-directory path (notice the ENOTDIR errors)?
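To illustrate the caching idea, here is a minimal sketch using `functools.lru_cache`. The function below is a hypothetical stand-in that approximates a setuptools-namespace check (reading the start of `__init__.py` and looking for `pkgutil`/`pkg_resources` markers); it is an assumption on my part and not necessarily how astroid's `_is_setuptools_namespace` is written, and the name `is_setuptools_namespace_cached` and the cache size are made up.

```python
import functools
import pathlib


@functools.lru_cache(maxsize=1024)
def is_setuptools_namespace_cached(location: pathlib.Path) -> bool:
    """Hypothetical cached stand-in for a setuptools-namespace check."""
    try:
        # Only the first few KiB are inspected; the markers normally sit
        # near the top of a namespace package's __init__.py.
        with open(location / "__init__.py", "rb") as stream:
            data = stream.read(4096)
    except OSError:
        # Covers missing files and non-directory locations (ENOTDIR).
        return False
    extend_path = b"pkgutil" in data and b"extend_path" in data
    declare_namespace = (
        b"pkg_resources" in data and b"declare_namespace(__name__)" in data
    )
    return extend_path or declare_namespace
```

With something like this, each distinct location would be opened at most once per run, including the failing ENOTDIR cases. The trade-off is that results go stale if files change while the process is running, so a real fix would probably need a way to clear the cache.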
Python profiling:
import pstats
stats = pstats.Stats('stats')
stats.print_callers('_io.open')
ncalls tottime cumtime
206 0.017 0.023 astroid/interpreter/_import/spec.py:329(_is_setuptools_namespace)
Pylint output
There is no output, just reduced performance
Expected behavior
Improved performance via caching or reduced filesystem accesses
Pylint version
astroid @ git+https://github.com/pylint-dev/astroid.git@a4a9fcc44ae0d71773dc3bff6baa78fc571ecb7d
pylint @ git+https://github.com/pylint-dev/pylint.git@500774ae5a4e49e2aa0c8d3f2b64613e21aa676e
Python 3.12.3
OS / Environment
Arch Linux
Additional dependencies
No response
Love those issues, keep them coming :heart: !