Question: use case for PyPI XMLRPC API
It was noticed that this project uses a PyPI API, XMLRPC, that has been in an indefinite state of "to be deprecated soon". The call is here:
https://github.com/sphinx-contrib/spelling/blob/22fe760921eb9cba4361d9fa71d785b4964daaea/sphinxcontrib/spelling/filters.py#L161-L166
Honestly, there's a good chance that this deprecation is drawing nearer as PyPI begins to explore a modern API for other use cases. Once we're over that hump, it is only a matter of time before XMLRPC is turned off.
So the question: "What is the current use case? Can I, as a PyPI administrator and PSF Infrastructure maintainer, help the maintainers of this project find a better option?"
Hi, @ewdurbin, thanks for reaching out.
We could at least disable the filter by default. I'll see if I can find some time to get to that.
@dhellmann It is fortunately already off by default :)
If you wanted to re-enable it....
Without digging too deeply it looks like you just need a list of strings?
curl -H"Accept: application/vnd.pypi.simple.v1+json" https://pypi.org/simple/ is a supported way to get a JSON representation of all known project names on PyPI.
ref: https://peps.python.org/pep-0691/
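For anyone who does pick this up, here is a minimal Python sketch of the same request, assuming the PEP 691 response shape (a top-level "projects" array whose entries each carry a "name"). The function names are hypothetical and are not part of sphinxcontrib.spelling; the parsing step is kept separate so it can be exercised without touching the network.

```python
import json
import urllib.request

# URL and media type are taken from the curl command above.
SIMPLE_INDEX_URL = "https://pypi.org/simple/"
PEP691_JSON = "application/vnd.pypi.simple.v1+json"


def parse_project_names(payload):
    """Extract project names from a PEP 691 index response body.

    Hypothetical helper: assumes the documented shape
    {"meta": {...}, "projects": [{"name": "..."}, ...]}.
    """
    data = json.loads(payload)
    return {project["name"] for project in data["projects"]}


def fetch_project_names(url=SIMPLE_INDEX_URL, timeout=30):
    """Download the full project list (a large response) and parse it."""
    request = urllib.request.Request(url, headers={"Accept": PEP691_JSON})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_project_names(response.read())
```

Calling fetch_project_names() pulls the entire index in one request, so a real filter would presumably want to cache the result rather than fetch it on every build.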
> @dhellmann It is fortunately already off by default :)
Clearly it has been a while since I've looked at this code base. :-)
> If you wanted to re-enable it....
> Without digging too deeply it looks like you just need a list of strings?
> curl -H"Accept: application/vnd.pypi.simple.v1+json" https://pypi.org/simple/ is a supported way to get a JSON representation of all known project names on PyPI.
Excellent, thank you for the tip, that'll help someone who wants to rewrite it (future me, or someone else who needs it).
As I indicated in #213 - even if this can be rewritten in terms of a curl call, I'd suggest it's a really bad idea to do so. There are 440k+ packages on PyPI, and every single one of them becomes a legal spelling word when this feature is turned on.
"Cute intentional misspellings of dictionary words" is a very common pattern of package naming - e.g., dropping the final vowel of flicker for flickr.
It won't pick up words that might be a violation of a project's style guide: namespace, phablet, or passthrough. This includes unilaterally accepting the US or UK spelling of any word that has a package by that name.
It won't pick up off-by-one typos: pypu instead of pypa or pypi.
I'm sure an audit of the 440k package names on PyPI would reveal plenty of other "interesting" spellings.
And none of this takes into account the load that is put on the PyPI servers by downloading a 440k word list every time it rebuilds the spelling environment.
I'd strongly advocate for this entire feature being deprecated and removed.
If it is going to be retained, it should be reduced to the packages in the currently installed virtual environment, rather than the whole of PyPI (although I'd argue explicit inclusion in a dictionary is a much safer option).
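As a rough illustration of the "installed environment only" option, the standard library's importlib.metadata can enumerate the distributions visible to the running interpreter. This is only a sketch: the function name is hypothetical, and it is not how the existing filter works.

```python
from importlib import metadata


def installed_project_names():
    """Return the names of distributions installed in the current environment.

    Hypothetical helper; uses only importlib.metadata (Python 3.8+).
    """
    names = set()
    for dist in metadata.distributions():
        # Broken or partial installs may lack a Name field; skip those.
        name = dist.metadata.get("Name")
        if name:
            names.add(name)
    return names
```

The resulting set is typically tens or hundreds of names rather than 440k+, and it needs no network access, though it still accepts every installed name as a valid word.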
@freakboy3742 Thank you for the advice. I don't intend to do anything at all with this for now, as I tried to make clear in an earlier comment here.