How to debug CPU 100% issue?

Open failable opened this issue 1 year ago • 2 comments

I like basedpyright.

I use basedpyright in small-scale projects, typically around 5000 lines of code. However, basedpyright often consumes all of my CPU, which makes the completion menu in my editor appear very slowly and leaves the jump functionality almost unusable. After I stop editing, it takes several minutes for the CPU usage to drop.

I'm using the default basedpyright configuration. I didn't encounter similar issues when I used pyright before, although I know basedpyright enables more rules by default than pyright. Basedpyright very easily reaches a state of very high CPU usage.

Python version: 3.10.11

$ basedpyright --version
basedpyright 1.17.3
based on pyright 1.1.379

Common dependencies in my projects:

dependencies = [
    "fastapi>=0.112.0",
    "uvicorn>=0.30.5",
    "structlog>=24.4.0",
    "mysqlclient>=2.2.4",
    "aiomysql>=0.2.0",
    "polars>=1.4.1",
    "pandas>=2.2.2",
    "numpy<2",
    "pyarrow>=17.0.0",
    "async-lru>=2.0.4",
    "sqlalchemy[asyncio]>=2.0.32",
    "asyncmy>=0.2.9",
    "pydantic-settings>=2.4.0",
    "cryptography>=43.0.0",
    "setuptools>=72.2.0",
    "pillow>=10.4.0",
    "regex>=2024.7.24",
    "connectorx>=0.3.3",
    "oss2>=2.18.6",
    "httpx[socks]>=0.27.2",
    "rapidocr-onnxruntime>=1.3.24",
]
dependencies = [
    "fastapi>=0.111.1",
    "pydantic>=2.8.2",
    "pydantic-settings>=2.3.4",
    "httpx[socks]>=0.27.0",
    "trafilatura[all]>=1.11.0",
    "motor>=3.5.1",
    "structlog>=24.4.0",
    "pandas>=2.2.2",
    "qdrant-client>=1.11.0",
    "llm-taxi>=0.5.0",
    "mistralai==0.4.2",
]

Editors:

Zed

Zed: v0.152.1 (Zed Preview)
OS: macOS 14.6.1
Memory: 32 GiB
Architecture: x86_64

VSCode

Version: 1.92.2 (Universal)
Commit: fee1edb8d6d72a0ddff41e5f71a671c23ed924b9
Date: 2024-08-14T17:29:30.058Z
Electron: 30.1.2
ElectronBuildId: 9870757
Chromium: 124.0.6367.243
Node.js: 20.14.0
V8: 12.4.254.20-electron.0
OS: Darwin x64 23.6.0

failable avatar Sep 10 '24 02:09 failable

a couple things to try:

  • change typeCheckingMode to "standard" and see if it makes a difference
  • compare with pyright with all its rules enabled:
    [tool.pyright]
    typeCheckingMode = "strict"
    # pyright doesn't have an "all" type checking mode so we still need to enable a bunch of them specifically:
    deprecateTypingAliases = true
    enableExperimentalFeatures = true
    reportMissingModuleSource = "error"
    reportCallInDefaultInitializer = "error"
    reportImplicitOverride = "error"
    reportImplicitStringConcatenation = "error"
    reportImportCycles = "error"
    reportMissingSuperCall = "error"
    reportPropertyTypeMismatch = "error"
    reportShadowedImports = "error"
    reportUninitializedInstanceVariable = "error"
    reportUnnecessaryTypeIgnoreComment = "error"
    reportUnusedCallResult = "error"
    
  • does it also happen on older versions of basedpyright?
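One rough way to compare type checking modes outside the editor is to time a CLI run per mode. The sketch below is only an illustration under several assumptions: `basedpyright` is on the PATH, the project has no `pyrightconfig.json` of its own (the helper overwrites it), and a `pyrightconfig.json` takes precedence over `[tool.pyright]` in `pyproject.toml`. The function names are made up for this example, and a one-off CLI run is not the same workload as the language server in an editor.

```python
import json
import subprocess
import time
from pathlib import Path


def write_config(project_dir: Path, mode: str) -> Path:
    """Write a minimal pyrightconfig.json that only sets typeCheckingMode.

    Warning: this overwrites any existing pyrightconfig.json in project_dir.
    """
    config_path = project_dir / "pyrightconfig.json"
    config_path.write_text(json.dumps({"typeCheckingMode": mode}, indent=2))
    return config_path


def time_command(cmd: list[str], cwd: Path) -> tuple[float, int]:
    """Run cmd in cwd and return (elapsed seconds, exit code)."""
    start = time.perf_counter()
    result = subprocess.run(cmd, cwd=cwd, capture_output=True, text=True)
    return time.perf_counter() - start, result.returncode


def compare_modes(project_dir: Path, modes=("standard", "all")) -> dict[str, float]:
    """Time one `basedpyright` CLI run per type checking mode."""
    timings = {}
    for mode in modes:
        write_config(project_dir, mode)
        # Assumption: the `basedpyright` executable is available on PATH.
        elapsed, _exit_code = time_command(["basedpyright"], project_dir)
        timings[mode] = elapsed
    return timings
```

Calling `compare_modes(Path("path/to/project"))` returns the elapsed seconds per mode; a large gap between "standard" and "all" would point at the extra rules rather than at the editor integration.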

DetachHead avatar Sep 10 '24 02:09 DetachHead

@DetachHead Thanks for the suggestions, I will try that. I started using basedpyright when its Zed extension was in development, around the middle of July. The issue has existed for me since then. I am not sure whether it also happens on older versions.

failable avatar Sep 10 '24 06:09 failable

I can confirm the same issue with VSCode on a Mac M2 Air and more recent versions of basedpyright. I experienced CPU spikes (up to 100% on a single core), high memory spikes (up to 4GB), and extreme energy consumption (853,32 macOS energy units, whatever they are in the "Energy" tab). I do not experience any problems (except for pyright's intended problems) with the original pyright version 1.1.38 with the above configuration.

Basedpyright Version

basedpyright 1.23.1
based on pyright 1.1.391

VSCode Version

1.96.2
fabdb6a30b49f79a7aba0f2ad9df9b399473380f
arm64

VSCode Extension Version

Identifier: detachhead.basedpyright
Version: 1.23.2
Last Updated: 2025-01-07, 05:15:15
Size: 17.4 MB

EDIT: the project I'm working on is quite big (~20k lines), so it strongly shows the difference between pyright and basedpyright in this case. Collecting pyright statistics for comparison right now.

dartt0n avatar Jan 07 '25 07:01 dartt0n

can you try the things i mentioned in https://github.com/DetachHead/basedpyright/issues/659#issuecomment-2339504636 and let me know if any of them make a difference? thanks

DetachHead avatar Jan 07 '25 09:01 DetachHead

After testing with pyright for 3 hours with the settings provided above, I still have CPU and memory spikes (it seems to happen when refactoring large files), but energy consumption has decreased significantly (down to 186,85 macOS energy units). However, it seems like pyright does not analyse the whole workspace, while basedpyright does, and that could be the issue. Right now I am testing with "standard" type checking and will come back with results.

dartt0n avatar Jan 07 '25 10:01 dartt0n

Seems like the "typeCheckingMode": "standard" setting solves the energy issue

dartt0n avatar Jan 07 '25 12:01 dartt0n

"After testing with pyright for 3 hours with the settings provided above, I still have CPU and memory spikes"

just to confirm, this was with pyright and not basedpyright? if so, that means the issue must be caused by one of the diagnostic rules that are disabled by default in pyright but enabled by default in basedpyright.

if you have the time and patience (i don't blame you if you don't), the next steps would be to gradually remove some of those rules and see if the issue persists to see if we can identify which rule exactly is the culprit
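Removing rules one at a time can be shortened by bisecting the rule list. The following is a minimal sketch of that idea, assuming exactly one rule is responsible and that the slowdown can be judged reliably for a given set of enabled rules. `find_culprit` and `is_slow` are hypothetical names; in practice `is_slow` would be a manual check (edit the config, work for a while, report the result).

```python
from typing import Callable, Optional, Sequence


def find_culprit(
    rules: Sequence[str],
    is_slow: Callable[[Sequence[str]], bool],
) -> Optional[str]:
    """Bisect `rules` to find the single rule for which is_slow() is True.

    is_slow(enabled) must report whether the slowdown reproduces when only
    the rules in `enabled` are switched on. Assumes one rule is responsible.
    """
    candidates = list(rules)
    if not candidates or not is_slow(candidates):
        return None  # the slowdown does not come from these rules
    while len(candidates) > 1:
        half = candidates[: len(candidates) // 2]
        # Keep whichever half still reproduces the slowdown.
        candidates = half if is_slow(half) else candidates[len(half):]
    return candidates[0]


# Rule names taken from the pyright configuration earlier in this thread.
EXTRA_RULES = [
    "reportMissingModuleSource",
    "reportCallInDefaultInitializer",
    "reportImplicitOverride",
    "reportImplicitStringConcatenation",
    "reportImportCycles",
    "reportMissingSuperCall",
    "reportPropertyTypeMismatch",
    "reportShadowedImports",
    "reportUninitializedInstanceVariable",
    "reportUnnecessaryTypeIgnoreComment",
    "reportUnusedCallResult",
]
```

With 11 rules this needs about four or five checks instead of eleven. If several rules contribute together, the bisection can mislead, so the result is worth confirming by toggling the reported rule on its own.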

also are you testing this with a public codebase? if so i can try and play around with it myself and see if i can reproduce the issue

DetachHead avatar Jan 07 '25 14:01 DetachHead

I will continue to use basedpyright on a daily basis because it solves so many problems that pyright has a hard time dealing with. I will try to set up a monitoring system for the basedpyright process and periodically turn various diagnostics on and off to get cleaner results, ideally narrowing the issue down to a single rule. But this will take a long time, since I need to track resources while I actively work on the project.
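A monitoring setup of that kind could start as a small sampler around `ps`. This is a sketch under assumptions rather than a tested tool: it assumes a Unix-like system whose `ps` accepts `-axo pid=,pcpu=,rss=,command=`, and that the language server's command line contains the string "basedpyright". All function names are illustrative.

```python
import subprocess
import time


def parse_ps(output: str, needle: str) -> list[tuple[int, float, int]]:
    """Return (pid, cpu_percent, rss_kib) for ps lines whose command contains needle."""
    samples = []
    for line in output.splitlines():
        parts = line.split(None, 3)  # pid, %cpu, rss, full command line
        if len(parts) == 4 and needle in parts[3]:
            # Some locales print a decimal comma, e.g. "12,5".
            cpu = float(parts[1].replace(",", "."))
            samples.append((int(parts[0]), cpu, int(parts[2])))
    return samples


def sample(needle: str = "basedpyright") -> list[tuple[int, float, int]]:
    """Take one snapshot of all matching processes."""
    out = subprocess.run(
        ["ps", "-axo", "pid=,pcpu=,rss=,command="],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_ps(out, needle)


def monitor(interval: float = 5.0, needle: str = "basedpyright") -> None:
    """Print one line per matching process every `interval` seconds until interrupted."""
    while True:
        stamp = time.strftime("%H:%M:%S")
        for pid, cpu, rss_kib in sample(needle):
            print(f"{stamp} pid={pid} cpu={cpu:.1f}% rss={rss_kib / 1024:.0f} MiB")
        time.sleep(interval)
```

Redirecting the output of `monitor()` to a file while toggling rules gives a timeline to correlate with the configuration changes. Note that the script would also match itself if its own command line contains the search string.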

Unfortunately, I cannot share the codebase, but I can give some insights about it. It's a Python 3.12 monorepo with a hexagonal architecture for multiple services and a lot of type checking (we have models for DTOs, API schemas, DB schemas, every interface and protocol, every message, etc.; basically everything is typed).

dartt0n avatar Jan 07 '25 15:01 dartt0n

Right now, I see that basedpyright analyzes every single file (200+) in the workspace every time VSCode File Save is triggered. Could this be related to ruff, which runs against the whole codebase to format it and therefore touches files, so that basedpyright treats this as a change in the source code even though the content has not actually changed?

UPD: Can confirm the same happening every time with pyright with the following config (each time I press cmd-s the whole project is re-analyzed):

openFilesOnly = false
useLibraryCodeForTypes = true
typeCheckingMode = "strict"
deprecateTypingAliases = true
enableExperimentalFeatures = true
reportMissingModuleSource = "error"
reportCallInDefaultInitializer = "error"
reportImplicitOverride = "error"
reportImplicitStringConcatenation = "error"
reportImportCycles = "error"
reportMissingSuperCall = "error"
reportPropertyTypeMismatch = "error"
reportShadowedImports = "error"
reportUninitializedInstanceVariable = "error"
reportUnnecessaryTypeIgnoreComment = "error"
reportUnusedCallResult = "error"

UPD2: Seems like every time pyproject.toml is touched, the whole project is re-analyzed. Not sure why it is touched in the first place
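Whether a tool touches files without changing them can be checked by comparing modification times and content hashes before and after a save. The snippet below is a small sketch of that check; the helper names are invented for illustration, and it says nothing about how the language server itself detects changes.

```python
import hashlib
from pathlib import Path


def snapshot(paths) -> dict[str, tuple[int, str]]:
    """Map each path to (mtime in nanoseconds, SHA-256 of its content)."""
    result = {}
    for p in map(Path, paths):
        digest = hashlib.sha256(p.read_bytes()).hexdigest()
        result[str(p)] = (p.stat().st_mtime_ns, digest)
    return result


def touched_without_change(before, after) -> list[str]:
    """Paths whose mtime changed while their content hash stayed the same."""
    return sorted(
        path
        for path, (mtime, digest) in after.items()
        if path in before
        and before[path][0] != mtime
        and before[path][1] == digest
    )
```

Taking `before = snapshot(Path(".").rglob("*.py"))`, pressing save, and then calling `touched_without_change(before, snapshot(Path(".").rglob("*.py")))` lists files that were rewritten with identical content. Adding `pyproject.toml` to the list of paths covers the case described above.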

dartt0n avatar Jan 07 '25 17:01 dartt0n

that's odd, it should only be doing that if basedpyright.analysis.diagnosticMode is set to "workspace" instead of the default value "openFilesOnly". this is configured in vscode's settings so can you check to see if that setting has been changed?

DetachHead avatar Jan 07 '25 21:01 DetachHead

Yes, I use the "workspace" diagnostic mode for both pyright and basedpyright, but I started noticing problems only in this project, everything was going smoothly for a few months before that.

However, I notice that all project files are analyzed from time to time, seemingly independent of my actions. I recorded myself writing dummy code inside one of the project files; pay attention to the notification bar at the bottom, which shows that basedpyright constantly analyzes the entire project.

Recording:

  • S3 Share Storage: https://storage.yandexcloud.net/dartt0n/share/basedpyright-issue.mp4
  • Jumpshare (24h link): https://jmp.sh/s/el2FJqX20s8Dv8eIfW5W

Settings:

# pyproject.toml
[tool.pyright]
venvPath = "."
venv = ".venv"
pythonVersion = "3.12"
pythonPlatform = "Linux"
reportUnknownMemberType = false
exclude = ["**/__pycache__", "**/.*"]

// settings.json
{
  "basedpyright.analysis.autoImportCompletions": true,
  "basedpyright.analysis.autoSearchPaths": true,
  "basedpyright.analysis.diagnosticMode": "workspace",
  "basedpyright.analysis.diagnosticSeverityOverrides": {},
  "basedpyright.analysis.exclude": [],
  "basedpyright.analysis.extraPaths": [],
  "basedpyright.analysis.ignore": [],
  "basedpyright.analysis.include": [],
  "basedpyright.analysis.inlayHints.callArgumentNames": true,
  "basedpyright.analysis.inlayHints.functionReturnTypes": true,
  "basedpyright.analysis.inlayHints.genericTypes": false,
  "basedpyright.analysis.inlayHints.variableTypes": true,
  "basedpyright.analysis.logLevel": "Trace",
  "basedpyright.analysis.stubPath": "typings",
  "basedpyright.analysis.typeCheckingMode": "all",
  "basedpyright.analysis.typeshedPaths": [],
  "basedpyright.analysis.useLibraryCodeForTypes": true,
  "basedpyright.disableLanguageServices": false,
  "basedpyright.disableOrganizeImports": false,
  "basedpyright.disableTaggedHints": false,
  "basedpyright.importStrategy": "fromEnvironment"
}

dartt0n avatar Jan 08 '25 06:01 dartt0n

that's very strange, from my understanding even when diagnosticMode is set to "workspace" it should only ever re-analyze files that depend on the file you edited. copying those settings and typing the same things as in your video does not seem to reproduce that issue for me.

are you able to reproduce it analyzing more files than it should on an empty project? for example, with two files that do not import each other, does making a change in one of them trigger a re-analysis of both?
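A throwaway project for this experiment can be generated with a few lines of Python. This is only a sketch: the file names, their contents, and the function name are arbitrary choices for illustration, and the `diagnosticMode` setting still has to be set to "workspace" in the editor, since the thread describes it as an editor setting.

```python
from pathlib import Path


def make_repro_project(root: Path) -> Path:
    """Create two modules that do not import each other, plus a minimal config."""
    root.mkdir(parents=True, exist_ok=True)
    # Two independent modules: editing one should not require re-analyzing the other.
    (root / "a.py").write_text("def a() -> int:\n    return 1\n")
    (root / "b.py").write_text("def b() -> str:\n    return 'b'\n")
    # Same table name as the configuration posted in this thread.
    (root / "pyproject.toml").write_text(
        '[tool.pyright]\npythonVersion = "3.12"\n'
    )
    return root
```

After running `make_repro_project(Path("repro"))`, open the `repro` folder in the editor, set the log level to Trace, edit `a.py`, and watch whether `b.py` shows up in the log.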

also are you able to check if there's anything interesting printed in the basedpyright log when this happens? located here:

[image: location of the basedpyright log]

DetachHead avatar Jan 08 '25 23:01 DetachHead

When I set the log level to TRACE and monitor the Output tab in vscode, I can see both pyright and basedpyright reanalyzing all related files on each key press. This may not be noticeable for tiny projects, but it can be an issue for large ones. It seems like when the whole project is a big Python library, changing one of the core files triggers an analysis of the entire library. I can confirm this with Django source code: https://storage.yandexcloud.net/dartt0n/share/basedpyright-django.mp4

EDIT: I use pyright in the video recording, but the effect of basedpyright is the same.

dartt0n avatar Jan 08 '25 23:01 dartt0n

ok so i get the same behavior in the django codebase, even when setting typeCheckingMode to "standard". i noticed that commenting out all the imports in ./django/forms/__init__.py reduces the number of files that get re-analyzed so i'm assuming pyright just determines a lot of the codebase to depend on that module (note that __init__.py files are implicitly imported as well if a specific module within its package is imported).
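The effect of implicit `__init__.py` imports on the set of affected files can be illustrated with a toy import graph. This is not pyright's actual algorithm, just a sketch of transitive dependents; the module names (`app.core`, `app.forms`, and so on) and both functions are hypothetical.

```python
from collections import deque


def add_implicit_packages(imports: dict[str, set[str]]) -> dict[str, set[str]]:
    """Importing "a.b.c" also loads the packages "a" and "a.b" (their __init__.py)."""
    expanded = {}
    for module, deps in imports.items():
        full = set(deps)
        for dep in deps:
            parts = dep.split(".")
            full.update(".".join(parts[:i]) for i in range(1, len(parts)))
        expanded[module] = full
    return expanded


def dependents_of(changed: str, imports: dict[str, set[str]]) -> set[str]:
    """Return every module that directly or transitively imports `changed`."""
    # Invert the graph: module -> modules that import it.
    imported_by: dict[str, set[str]] = {}
    for module, deps in imports.items():
        for dep in deps:
            imported_by.setdefault(dep, set()).add(module)

    seen: set[str] = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for parent in imported_by.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    seen.discard(changed)  # only relevant when the graph has import cycles
    return seen


# Hypothetical project: app/forms/__init__.py imports app.core,
# while app.views only names the submodule app.forms.fields.
IMPORTS = {
    "app.forms": {"app.core"},
    "app.forms.fields": set(),
    "app.views": {"app.forms.fields"},
    "app.utils": set(),
}
```

Here `dependents_of("app.core", IMPORTS)` contains only `app.forms`, but `dependents_of("app.core", add_implicit_packages(IMPORTS))` also contains `app.views`, because importing `app.forms.fields` pulls in `app/forms/__init__.py`. A package `__init__.py` with many imports can therefore make a large share of a codebase depend on a single core module.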

as far as i can tell this behavior is expected, but it's hard to tell for sure unless i can see a concrete example of files being re-analyzed that have no connection at all to the file being modified.

back to the original issue with the CPU usage, tbh i don't know enough about the pyright codebase to even know where to begin investigating this, especially since you can reproduce it on pyright as well. it might be worth raising an issue there and see what eric traut says.

maybe setting diagnosticMode to "openFilesOnly" is the only option

DetachHead avatar Jan 09 '25 11:01 DetachHead

Thanks a lot for your contribution! I think the original issue can be closed: in my case, I indeed see spikes in memory and CPU usage only when all files are being analyzed. Another possible solution would be to introduce a config option that runs pyright's full analysis only on file saves while keeping the LSP features for file edits (similar to Rust Analyzer or gopls). However, that would probably require some fundamental changes to the architecture.

dartt0n avatar Jan 09 '25 12:01 dartt0n