
Ignore sdists with malformed names

Open njsmith opened this issue 1 year ago • 3 comments

When reading https://pypi.org/simple/cffi, we currently see cffi-1.0.2-2.tar.gz and parse it as name: cffi-1.0.2, version: 2. And then in PackageDB::available_artifacts("cffi"), we end up filing this under version 2.
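To illustrate how this misparse can arise, here is a minimal sketch that assumes the sdist filename is split at its last hyphen. `naive_parse_sdist` is a hypothetical helper written for this example, not posy's actual parser; it only reproduces the behavior described above.

```rust
/// Hypothetical helper (not posy's real parser): strip the `.tar.gz`
/// suffix, then split the remaining stem at its LAST hyphen.
fn naive_parse_sdist(filename: &str) -> Option<(&str, &str)> {
    let stem = filename.strip_suffix(".tar.gz")?;
    let (name, version) = stem.rsplit_once('-')?;
    // Reject degenerate splits such as "-1.0" or "foo-".
    if name.is_empty() || version.is_empty() {
        return None;
    }
    Some((name, version))
}
```

Under this assumption, `cffi-1.0.2-2.tar.gz` splits into the name `cffi-1.0.2` and the version `2`, while `scikit-learn-1.0.2.tar.gz` splits correctly into `scikit-learn` and `1.0.2`.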

I don't think we can parse this sdist name in general -- at least without breaking much more common cases like scikit-learn-1.0.2.tar.gz. But a very simple thing we could do is, when reading a simple API page, ignore all entries whose name doesn't match the simple API page we're looking at!
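A rough sketch of that filter, under a few assumptions: `normalize` and `sdists_for_page` are hypothetical names made up for this example, the parse is the same last-hyphen split as in the problem description, and names are compared after PEP 503 style normalization (lowercase, with runs of `-`, `_`, and `.` collapsed to a single `-`). None of this is taken from posy's code.

```rust
/// Hypothetical helper: PEP 503 style name normalization
/// (lowercase; runs of '-', '_', '.' become a single '-').
fn normalize(name: &str) -> String {
    let mut out = String::with_capacity(name.len());
    let mut prev_sep = false;
    for c in name.chars() {
        if matches!(c, '-' | '_' | '.') {
            if !prev_sep {
                out.push('-');
            }
            prev_sep = true;
        } else {
            out.push(c.to_ascii_lowercase());
            prev_sep = false;
        }
    }
    out
}

/// Hypothetical helper: parse each `.tar.gz` filename listed on a simple
/// API page and keep only entries whose parsed name matches the page.
fn sdists_for_page<'a>(page_name: &str, filenames: &[&'a str]) -> Vec<(&'a str, &'a str)> {
    let wanted = normalize(page_name);
    filenames
        .iter()
        .copied()
        // Split at the last hyphen, as in the misparse described above.
        .filter_map(|f| f.strip_suffix(".tar.gz")?.rsplit_once('-'))
        // Drop anything whose parsed name is not this page's project.
        .filter(|(name, _)| normalize(name) == wanted)
        .collect()
}
```

With this filter, `cffi-1.0.2-2.tar.gz` parses to the name `cffi-1.0.2`, which does not match the `cffi` page, so the entry is dropped instead of being filed under version 2.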

(I guess we could also get fancier, and try to use the simple API page to bias the sdist name parsing? But I think stuff like cffi-1.0.2-2.tar.gz is super rare and we can probably just skip it.)
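For comparison, the fancier option might look something like the sketch below: try every hyphen as a split point and take the first one whose left side normalizes to the page name. `normalize` and `parse_sdist_for_page` are hypothetical names for this example, and validating the resulting version string is deliberately left out.

```rust
/// Hypothetical helper: PEP 503 style name normalization
/// (lowercase; runs of '-', '_', '.' become a single '-').
fn normalize(name: &str) -> String {
    let mut out = String::with_capacity(name.len());
    let mut prev_sep = false;
    for c in name.chars() {
        if matches!(c, '-' | '_' | '.') {
            if !prev_sep {
                out.push('-');
            }
            prev_sep = true;
        } else {
            out.push(c.to_ascii_lowercase());
            prev_sep = false;
        }
    }
    out
}

/// Hypothetical helper: use the simple API page name to choose where the
/// project name ends and the version begins.
fn parse_sdist_for_page<'a>(page_name: &str, filename: &'a str) -> Option<(&'a str, &'a str)> {
    let stem = filename.strip_suffix(".tar.gz")?;
    let wanted = normalize(page_name);
    stem.match_indices('-')
        // Candidate split at each hyphen: (name, version).
        .map(|(i, _)| (&stem[..i], &stem[i + 1..]))
        // Accept the first candidate whose name matches the page.
        .find(|(name, version)| !version.is_empty() && normalize(name) == wanted)
}
```

On the `cffi` page this would read `cffi-1.0.2-2.tar.gz` as name `cffi` with version string `1.0.2-2`, and on the `scikit-learn` page it would still read `scikit-learn-1.0.2.tar.gz` as `scikit-learn` and `1.0.2`. The extra logic only pays off for the rare malformed names, which is the argument for just skipping them.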

njsmith, Jan 24 '23 09:01