Question: use case for PyPI XMLRPC API

Open ewdurbin opened this issue 3 years ago • 8 comments

It was noticed that this project uses a PyPI API that has been in an indefinite state of "to be deprecated soon", in

https://github.com/sphinx-contrib/spelling/blob/22fe760921eb9cba4361d9fa71d785b4964daaea/sphinxcontrib/spelling/filters.py#L161-L166

Honestly, there's a good chance that this deprecation is growing nearer and nearer as PyPI begins to explore a modern API for other use cases. As soon as we're over that hump, it is only a matter of time before XMLRPC is turned off.

So the Question: "What is the current use case? Can I as a PyPI administrator and PSF Infrastructure maintainer help the maintainers of this project find a better option?"

ewdurbin avatar Apr 13 '23 01:04 ewdurbin

Hi, @ewdurbin , thanks for reaching out.

We could at least disable the filter by default. I'll see if I can find some time to get to that.

dhellmann avatar Apr 13 '23 21:04 dhellmann

@dhellmann It is fortunately already off by default :)

coderanger avatar Apr 13 '23 21:04 coderanger

If you wanted to re-enable it....

Without digging too deeply it looks like you just need a list of strings?

curl -H"Accept: application/vnd.pypi.simple.v1+json" https://pypi.org/simple/ is the supported way to get a JSON representation of all known project names on PyPI.

ewdurbin avatar Apr 13 '23 21:04 ewdurbin

ref: https://peps.python.org/pep-0691/
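For anyone rewriting the filter, the curl call above might translate to Python roughly as follows. This is a minimal sketch, not the project's code: the function names `parse_project_names` and `fetch_project_names` are hypothetical, and it assumes the PEP 691 index document carries a top-level `projects` list whose entries each have a `name` key.

```python
import json
import urllib.request

SIMPLE_INDEX_URL = "https://pypi.org/simple/"
JSON_ACCEPT = "application/vnd.pypi.simple.v1+json"


def parse_project_names(payload: bytes) -> list[str]:
    """Extract project names from a PEP 691 JSON index document."""
    document = json.loads(payload)
    # Assumption: the index has a "projects" list of {"name": ...} entries.
    return [project["name"] for project in document["projects"]]


def fetch_project_names(url: str = SIMPLE_INDEX_URL) -> list[str]:
    """Download the project index, asking for the JSON representation."""
    request = urllib.request.Request(url, headers={"Accept": JSON_ACCEPT})
    with urllib.request.urlopen(request, timeout=30) as response:
        return parse_project_names(response.read())
```

Keeping the parsing separate from the download means the parsing can be tested against a small local payload without contacting PyPI.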

ewdurbin avatar Apr 13 '23 21:04 ewdurbin

@dhellmann It is fortunately already off by default :)

Clearly it has been a while since I've looked at this code base. :-)

dhellmann avatar Apr 13 '23 21:04 dhellmann

If you wanted to re-enable it....

Without digging too deeply it looks like you just need a list of strings?

curl -H"Accept: application/vnd.pypi.simple.v1+json" https://pypi.org/simple/ is the supported way to get a JSON representation of all known project names on PyPI.

Excellent, thank you for the tip, that'll help someone who wants to rewrite it (future me, or someone else who needs it).

dhellmann avatar Apr 13 '23 21:04 dhellmann

As I indicated in #213 - even if this can be rewritten in terms of a curl call, I'd suggest it's a really bad idea to do so. There are 440k+ packages on PyPI, and every single one of them becomes a legal spelling word when this feature is turned on.

"Cute intentional misspellings of dictionary words" is a very common pattern of package naming - e.g., dropping the final vowel of flicker for flickr.

It won't pick up words that might be a violation of a project's style guide: namespace, phablet, or passthrough. This includes unilaterally accepting the US or UK spelling of any word that has a package by that name.

It won't pick up off-by-one typos: pypu instead of pypa or pypi.
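The masking effect can be illustrated with a toy checker. This is only a sketch under stated assumptions: `unknown_words` is a hypothetical helper rather than the sphinxcontrib.spelling filter API, and the word sets are made-up samples that mirror the examples above, not data pulled from PyPI.

```python
def unknown_words(words, accepted):
    """Return the words that are not in the accepted set (case-insensitive)."""
    accepted_lower = {word.lower() for word in accepted}
    return [word for word in words if word.lower() not in accepted_lower]


# Hypothetical samples, for illustration only.
dictionary = {"the", "photo", "flicker", "on", "screen"}
package_names = {"flickr", "namespace"}

text = ["the", "photo", "flickr", "on", "screen"]

# With the dictionary alone, the misspelling is reported.
flagged_plain = unknown_words(text, dictionary)

# Once package names count as valid words, the same misspelling passes.
flagged_with_packages = unknown_words(text, dictionary | package_names)
```

Here `flagged_plain` contains "flickr" while `flagged_with_packages` is empty, which is the failure mode described above.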

I'm sure an audit of the 440k package names on PyPI would reveal plenty of other "interesting" spellings.

And none of this takes into account the load that is put on the PyPI servers by downloading a 440k word list every time it rebuilds the spelling environment.

I'd strongly advocate for this entire feature being deprecated and removed.

If it is going to be retained, it should be reduced to the packages in the currently installed virtual environment, rather than the whole of PyPI (although I'd argue explicit inclusion in a dictionary is a much safer option).
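Restricting the word list to the current environment could look something like the sketch below. It uses the standard library's `importlib.metadata`. The function name `installed_project_names` is hypothetical, and how the result would be wired into the existing filter is left open.

```python
from importlib import metadata


def installed_project_names() -> set[str]:
    """Collect the names of distributions installed in the current environment."""
    names = set()
    for dist in metadata.distributions():
        # Distributions with broken metadata may have no usable name; skip them.
        name = dist.metadata.get("Name")
        if name:
            names.add(name)
    return names
```

This needs no network access, so it puts no load on PyPI and yields a far smaller list than the full index.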

freakboy3742 avatar Apr 16 '23 23:04 freakboy3742

@freakboy3742 Thank you for the advice. I don't intend to do anything at all with this for now, as I tried to make clear in an earlier comment here.

dhellmann avatar Apr 17 '23 18:04 dhellmann