basedpyright

Docstrings for standard library

Open Kayzels opened this issue 11 months ago • 7 comments

One feature that Pylance has that I'm sorely missing is detailed docstrings for the Python standard library. Pyright seems to have very few, if any. I'm not sure whether Microsoft wrote the docs that Pylance uses themselves, or where they come from. But if they're available somewhere, it would be great to have them as part of this LSP, for Pylance parity.

Kayzels avatar Mar 17 '24 19:03 Kayzels

The doc popup should also link to docs.python.org like PyCharm does: image

KotlinIsland avatar Mar 18 '24 00:03 KotlinIsland

I would love this too. Currently I have the library saved locally and have to add it to my extra paths.

karimlevallois avatar Mar 21 '24 11:03 karimlevallois

These stdlib docs actually seem to come from the local Python installation rather than from anything special that Pylance does. You just need to ensure that client.config.settings.python.pythonPath points to your Python interpreter.

karimlevallois avatar Apr 05 '24 22:04 karimlevallois

there does seem to be a difference between pylance and pyright here. for me in vscode, pylance shows docs for the builtins while pyright doesn't, with the same value set for python.pythonPath

DetachHead avatar Apr 06 '24 02:04 DetachHead

Pyright can show the docs; currently you just have to point extraPaths to a folder that contains them. The open question is where to point extraPaths and how to get access to the stdlib. Someone has done it, and I will ask him how he exposed them. He'll jump on this issue later…

karimlevallois avatar Apr 06 '24 03:04 karimlevallois

So I am the "someone" mentioned by @karimlevallois.

Here is a python script stripped from Pylance which can be used to generate most of the stdlib docstrings for the Python interpreter that runs it: module_scraper.zip

Usage:

$ python main.py OUTPUT_DIR

Here are some generated example stubs: samples

Those generated stubs are specific to the OS and Python version that produced them, so they may contain more or fewer functions and function arguments than a user on a different OS / Python version would see. Here's a screenshot of them in use in Sublime Text.

image

I use python.pythonPath with that script to generate the stdlib stubs in my secret Sublime Text plugin. (I have invited @DetachHead to that private repository, but maybe he/she's not interested.)

jfcherng avatar Apr 06 '24 15:04 jfcherng

Here is a python script stripped from Pylance

thanks for the info, though i'm a bit paranoid about stuff like this. since pylance is closed source i don't think we can legally use any of its source code (or decompiled code) in basedpyright. i'd rather not risk getting a cease and desist from microsoft.

i'm obviously not a lawyer but from my understanding, clean-room reverse engineering means that you're allowed to decompile pylance and tell me how it works, but i can't do it myself (and this may also prevent you from being able to contribute code, since you now have knowledge of the proprietary plugin)

DetachHead avatar Apr 07 '24 00:04 DetachHead

i think the ideal solution would be to just add docstrings to builtins.pyi in typeshed. i've raised https://github.com/python/typeshed/issues/12085

if that gets rejected, we can consider a different approach

DetachHead avatar Jun 02 '24 05:06 DetachHead

Jedi language server shows docstrings for standard library functions afaik. Would it be possible to use Jedi's implementation for getting docstrings? I do not know how Jedi implements it, but it is open sourced under MIT.

https://github.com/pappasam/jedi-language-server https://github.com/davidhalter/jedi

alven-shin avatar Jun 08 '24 17:06 alven-shin

Jedi language server shows docstrings for standard library functions afaik. Would it be possible to use Jedi's implementation for getting docstrings? I do not know how Jedi implements it, but it is open sourced under MIT.

Jedi is able to get the stdlib docstrings because it runs in Python and can inspect the live objects directly. https://github.com/davidhalter/jedi/blob/065580b5d4823f49ed033ec3ce51b4b1fafd370d/jedi/inference/compiled/access.py#L190-L191

>>> import inspect
>>> inspect.getdoc(str.strip)
'Return a copy of the string with leading and trailing whitespace removed.\n\nIf chars is given and not None, remove characters in chars instead.'
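
The same call extends from one function to a whole module. The following is a minimal sketch, not Jedi's actual implementation; `collect_docstrings` is a hypothetical helper name, and it assumes the module can be imported by the interpreter that runs it.

```python
from __future__ import annotations

import importlib
import inspect


def collect_docstrings(module_name: str) -> dict[str, str]:
    """Map each public function or class in a module to its docstring."""
    module = importlib.import_module(module_name)
    docs: dict[str, str] = {}
    for name, obj in inspect.getmembers(module):
        if name.startswith("_"):
            continue
        # Skip constants and submodules: for a plain value, getdoc() would
        # report the docstring of its type (e.g. int), which is not useful.
        if not (inspect.isroutine(obj) or inspect.isclass(obj)):
            continue
        doc = inspect.getdoc(obj)
        if doc:
            docs[name] = doc
    return docs


if __name__ == "__main__":
    for name, doc in sorted(collect_docstrings("json").items()):
        # Print only the first line of each docstring.
        print(f"{name}: {doc.splitlines()[0]}")
```

Because this reads docstrings from live objects, the result depends on the Python version and platform of the interpreter running it.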

jfcherng avatar Jun 08 '24 17:06 jfcherng

it turns out that pylance module scraper is MIT licensed, so i think that means we're allowed to use it

DetachHead avatar Jun 08 '24 23:06 DetachHead

it turns out that pylance module scraper is MIT licensed, so i think that means we're allowed to use it

Awesome!

karimlevallois avatar Jun 08 '24 23:06 karimlevallois

it turns out that pylance module scraper is MIT licensed, so i think that means we're allowed to use it

Just a FYI. On my side, somehow pyright seems to always prefer using its bundled typeshed's stdlib stubs. So eventually I use .py extension for generated stdlib stubs so that pyright thinks they are sources and use docstrings from them.

jfcherng avatar Jun 11 '24 08:06 jfcherng

Just a FYI. On my side, somehow pyright seems to always prefer using its bundled typeshed's stdlib stubs. So eventually I use .py extension for generated stdlib stubs so that pyright thinks they are sources and use docstrings from them.

thanks! i got it working too using this method (i also had to put them in a separate folder and add it to extraPaths in my config). i think i'll make basedpyright dump these doc stubs in a separate folder next to typeshed-fallback and add it to the paths by default.
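
The `.py` doc-stub approach described above can be sketched roughly as follows. This is not the Pylance module scraper; `write_doc_stub` is a hypothetical helper, it only handles top-level functions of a top-level module, and it assumes the placeholder `(*args, **kwargs)` signature is acceptable because type information still comes from the typeshed stubs.

```python
from __future__ import annotations

import importlib
import inspect
import keyword
from pathlib import Path


def write_doc_stub(module_name: str, output_dir: Path) -> Path:
    """Write <output_dir>/<module_name>.py containing only docstrings."""
    module = importlib.import_module(module_name)
    lines: list[str] = []

    module_doc = inspect.getdoc(module)
    if module_doc:
        # repr() yields a valid string literal, so quotes and backslashes
        # inside the docstring cannot break the generated file.
        lines += [repr(module_doc), ""]

    for name, obj in inspect.getmembers(module, inspect.isroutine):
        if name.startswith("_") or not name.isidentifier() or keyword.iskeyword(name):
            continue
        doc = inspect.getdoc(obj)
        if not doc:
            continue
        # Assumption: the signature here is a placeholder only.
        lines += [f"def {name}(*args, **kwargs):", f"    {doc!r}", ""]

    output_dir.mkdir(parents=True, exist_ok=True)
    target = output_dir / f"{module_name}.py"
    target.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return target


if __name__ == "__main__":
    # "doc_stubs" is an example folder name; it would then be listed in extraPaths.
    print(write_doc_stub("math", Path("doc_stubs")))
```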

a question to consider before going forward with this - when do we want the module scraper to run? i think we have 2 options:

  1. on the user's machine when the language server starts
    • pros: will always have the docs for the correct python version
    • cons: language server startup will be slower, won't work on the playground (currently) because it will require a python interpreter at runtime
  2. when building the wheel
    • pros: no performance impact, works on the playground
    • cons: would need to either build separate wheels for each python version/platform or generate doc stubs for every version/platform in the same wheel, docs will be outdated if user changes python version and doesn't reinstall basedpyright
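
One way to soften the startup cost of option 1 would be to cache the generated doc stubs per Python version and platform, so that the scraper only runs again when the interpreter changes. A rough sketch of that idea follows; the helper names and the `py312-linux` style directory layout are assumptions for illustration, not anything basedpyright currently does.

```python
from __future__ import annotations

import sys
from pathlib import Path


def doc_stub_dir(
    cache_root: Path,
    version: tuple[int, ...] | None = None,
    platform: str | None = None,
) -> Path:
    """Return the cache folder for one Python minor version and platform."""
    # Patch releases share a folder because only major.minor is used.
    major, minor = (version or sys.version_info)[:2]
    return cache_root / f"py{major}{minor}-{platform or sys.platform}"


def needs_generation(stub_dir: Path) -> bool:
    """True when no doc stubs have been generated for this folder yet."""
    return not stub_dir.is_dir() or not any(stub_dir.glob("*.py"))


if __name__ == "__main__":
    target = doc_stub_dir(Path("doc_stub_cache"))
    print(target, "needs generation:", needs_generation(target))
```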

DetachHead avatar Jun 12 '24 14:06 DetachHead

Fwiw, Pylance obviously uses option 1. If you want to go with option 2, generating for the latest release of each of py38~py312 is probably enough at this moment (patch versions won't have feature changes). As for platforms, probably we need a GitHub CI script for all of them (and py versions).

jfcherng avatar Jun 12 '24 14:06 jfcherng

docs will be outdated if user changes python version and doesn't reinstall basedpyright

If you change Python version, you would be changing venv, right? So the basedpyright install would be a different one anyway.

As for platforms, probably we need a GitHub CI script for all of them (and py versions).

https://github.com/pypa/cibuildwheel

KotlinIsland avatar Jun 12 '24 14:06 KotlinIsland

I've actually made a script to add docstrings to pre-existing stubs over here: https://github.com/AThePeanut4/docify

It uses libcst to parse the stub file and inject docstrings into it. It also handles sys.version and sys.platform conditional blocks, and is able to modify the files in-place. It won't overwrite existing docstrings, so you can incrementally add docstrings for multiple versions/platforms by running it multiple times on the same set of files.

It seems to work on python 3.9-3.12, and on Linux and macOS. Haven't got around to testing on Windows.

I wrote the initial version shortly after I started using basedpyright, not long after this issue was created - only recently found the time to clean it up and make a repo and all that.
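
For readers who want to see the injection idea in miniature, here is a sketch that uses the standard library `ast` module instead of libcst. It is not docify's implementation: `add_docstrings` is a hypothetical name, it only handles top-level functions, it ignores `sys.version_info` / `sys.platform` blocks, and `ast.unparse` (Python 3.9+) discards comments and original formatting, which is one reason a real tool would prefer libcst.

```python
from __future__ import annotations

import ast
import importlib
import inspect


def add_docstrings(stub_source: str, module_name: str) -> str:
    """Return stub_source with runtime docstrings added to top-level functions."""
    module = importlib.import_module(module_name)
    tree = ast.parse(stub_source)
    for node in tree.body:
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        # Never overwrite a docstring that is already present.
        if ast.get_docstring(node) is not None:
            continue
        runtime_obj = getattr(module, node.name, None)
        if runtime_obj is None:
            continue
        doc = inspect.getdoc(runtime_obj)
        if not doc:
            continue
        # A docstring is a string expression placed first in the body.
        node.body.insert(0, ast.Expr(value=ast.Constant(value=doc)))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)


if __name__ == "__main__":
    print(add_docstrings("def sqrt(x: float) -> float: ...\n", "math"))
```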

AThePeanut4 avatar Jun 12 '24 14:06 AThePeanut4