Packages without releases should not be on /simple

Open cztomsik opened this issue 7 years ago • 7 comments

Describe the bug /simple/, list_packages and list_packages_with_serials all return removed packages.

Expected behavior Only packages listed on pypi.org should be returned.

To Reproduce Go to https://pypi.org/simple/ and search for package 0 or rever. These packages are not installable.

Additional context I've quickly skimmed through the code and I think (given my limited Python knowledge) that removing a project actually performs the delete, so this could be some kind of stale data in the database.

cztomsik avatar Aug 10 '18 08:08 cztomsik

For a given package, PyPI has Projects and Releases. The examples provided are Projects that were registered but never uploaded a release (previously allowed), or that uploaded releases which were later removed.

We probably need a discussion on whether and how we want to filter them from /simple. cc @dstufft @di

ewdurbin avatar Aug 10 '18 12:08 ewdurbin

So it probably works as intended, but I'd still argue that it's a little unexpected. I mean, if you call list_packages you will get 0 and rever (and probably a lot of others), and none of those can be obtained through https://pypi.org/pypi/<pkg>/json

It seems bandersnatch has a similar check: https://bitbucket.org/pypa/bandersnatch/src/9fa97648f980d25f7c255f6d513da4bc6f6be2aa/src/bandersnatch/package.py?at=default&fileviewer=file-view-default#package.py-119
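
The mismatch described above can be checked with a short script. This is only a sketch, not code from Warehouse or bandersnatch: the helper names `has_json_metadata` and `names_without_metadata` are made up for illustration, and it assumes the JSON endpoint answers with an HTTP error for the affected projects, as reported above.

```python
import urllib.error
import urllib.request


def has_json_metadata(name):
    """Return True if https://pypi.org/pypi/<name>/json responds with 200.

    Hypothetical helper for illustration; it performs a real HTTP request.
    """
    url = f"https://pypi.org/pypi/{name}/json"
    try:
        with urllib.request.urlopen(url) as response:
            return response.status == 200
    except urllib.error.HTTPError:
        # Assumption: projects without releases produce an HTTP error here.
        return False


def names_without_metadata(names, check=has_json_metadata):
    """Return the names for which `check` reports no JSON metadata.

    `check` is injectable so the logic can be exercised without network access.
    """
    return [name for name in names if not check(name)]
```

The input names could come from the XML-RPC call discussed here, for example `xmlrpc.client.ServerProxy("https://pypi.org/pypi").list_packages()`, although checking every name this way would mean one HTTP request per project.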

cztomsik avatar Aug 10 '18 13:08 cztomsik

I've revised the title a bit here to better describe what is happening.

I don't see any reason to list these packages (i.e. those without any releases) in /simple: they will never be installable, and removing them will make requests to /simple a little more lightweight as well.
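
As a rough illustration of the filtering proposed here, the sketch below drops projects that have no releases before building the list of names. The `Project` dataclass and `simple_index_names` are stand-ins invented for this example; they are not Warehouse's actual models or view code, and the sample data is made up.

```python
from dataclasses import dataclass, field


@dataclass
class Project:
    """Illustrative stand-in for a registered project (not Warehouse's model)."""
    name: str
    releases: list = field(default_factory=list)


def simple_index_names(projects):
    """Return the sorted names of projects that have at least one release."""
    return sorted(p.name for p in projects if p.releases)


# Made-up sample data: two projects without releases and one with a release.
projects = [
    Project("rever"),
    Project("requests", ["2.19.1"]),
    Project("0"),
]
print(simple_index_names(projects))
```

In a real implementation the filter would presumably live in the database query rather than in Python, so that empty projects are never loaded in the first place.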

di avatar Aug 10 '18 15:08 di

Awesome, what about those two xmlrpc api calls? list_packages and list_packages_with_serials?

BTW: I could probably help (I've noticed there is a Docker container), but I'm not sure how to correctly replicate this issue (maybe importing the SQL of a few packages would help, if you can export that for me).

cztomsik avatar Aug 10 '18 16:08 cztomsik

Not sure what the implications are of changing the XMLRPC responses. We generally shy away from invasive changes to them. For instance, we'd need to coordinate with projects like https://github.com/pypa/bandersnatch to ensure we're not interfering with their usage https://github.com/pypa/bandersnatch/blob/1d562e857b8b6755acac2341721b2d5e760eb920/src/bandersnatch/master.py#L80-L81

ewdurbin avatar Aug 10 '18 17:08 ewdurbin

Yeah, but we could probably add an option there (include_empty?). It's similar to /simple/

https://github.com/pypa/warehouse/blob/master/warehouse/legacy/api/xmlrpc/views.py#L198 https://github.com/pypa/warehouse/blob/master/warehouse/legacy/api/simple.py#L44
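
A minimal sketch of what such an option might look like, assuming a hypothetical `include_empty` parameter that defaults to `True` so existing callers keep the current behavior. The function below works on a plain dict of release counts; it is not the signature or implementation of the real XML-RPC view.

```python
def list_packages(release_counts, include_empty=True):
    """Return sorted project names, optionally skipping projects with no releases.

    `release_counts` maps a project name to its number of releases.
    `include_empty` is a hypothetical flag; True mirrors the current behavior.
    """
    return sorted(
        name
        for name, count in release_counts.items()
        if include_empty or count > 0
    )


# Made-up sample data for illustration.
counts = {"0": 0, "rever": 0, "requests": 3}
print(list_packages(counts))
print(list_packages(counts, include_empty=False))
```

Keeping the default at `True` means clients that already depend on the full list would see no change unless they opt in.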

cztomsik avatar Aug 10 '18 17:08 cztomsik

Yeah, I agree with @ewdurbin, it's probably not worth it to change the behavior of the XML-RPC endpoints at this point in time, but probably something we should consider for the API that will eventually replace them.

di avatar Aug 10 '18 19:08 di