`--skip-existing` gives misleading feedback
Is there an existing issue for this?
- [X] I have searched the existing issues (open and closed), and could not find an existing issue
What keywords did you use to search existing issues?
skip
What operating system are you using?
macOS
If you selected 'Other', describe your Operating System here
No response
What version of Python are you running?
$ python --version
Python 3.11.8
How did you install twine? Did you use your operating system's package manager or pip or something else?
$ python3 -m pip install --upgrade twine
What version of twine do you have installed (include the complete output)
$ twine --version
twine version 5.0.0 (importlib-metadata: 7.0.2, keyring: 24.3.1, pkginfo: 1.10.0, requests: 2.31.0, requests-toolbelt: 1.0.0, urllib3: 2.2.1)
Which package repository are you using?
test.pypi.org
Please describe the issue that you are experiencing
When I run `twine upload` with the `--skip-existing` flag, twine reports that it skipped the existing files, and it emits only warnings, not errors. However, it also shows the colorful progress bar for those same files, which appears to indicate that it actually DID upload them.
Please list the steps required to reproduce this behaviour
$ python3 -m twine upload --repository testpypi --skip-existing dist/*
Uploading mobyfubarbbq-0.0.1.post1-py3-none-any.whl
100% ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 22.7/22.7 kB • 00:00 • 21.7 MB/s
WARNING Skipping mobyfubarbbq-0.0.1.post1-py3-none-any.whl because it appears to already exist
Uploading mobyfubarbbq-0.0.1rc2-py3-none-any.whl
100% ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 22.6/22.6 kB • 00:00 • 42.0 MB/s
Uploading mobyfubarbbq-0.0.1.post1.tar.gz
100% ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 36.7/36.7 kB • 00:00 • 67.2 MB/s
WARNING Skipping mobyfubarbbq-0.0.1.post1.tar.gz because it appears to already exist
Uploading mobyfubarbbq-0.0.1rc2.tar.gz
100% ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 36.7/36.7 kB • 00:00 • 82.9 MB/s
Anything else you'd like to mention?
I expect it to only show the WARNING lines without the Uploading and progress bar lines for mobyfubarbbq-0.0.1.post1.tar.gz and mobyfubarbbq-0.0.1.post1-py3-none-any.whl.
We can only determine if something exists reliably if we attempt to upload it. That's why there's a progress bar. Hiding it and only showing it when the upload succeeds could work, but I don't believe that would provide any value to the user. Additionally, I don't believe we are able to retroactively hide it, though maybe the underlying library has improved since I last looked.
> We can only determine if something exists reliably if we attempt to upload it.
Is this true? package_is_uploaded uses the JSON API to determine if the file already exists:
https://github.com/pypa/twine/blob/67e87ef4221123454403b11ee8f802e87fcc13fd/twine/repository.py#L202-L232
and this happens before upload:
https://github.com/pypa/twine/blob/67e87ef4221123454403b11ee8f802e87fcc13fd/twine/commands/upload.py#L165-L169
My read is that we'd only need to upload if a file doesn't appear in the JSON response for a project, and that this would only fail to upload if the file once existed but had been deleted.
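To make that reading concrete, here is a minimal sketch of the check, assuming a parsed JSON response with a `releases` mapping of version to file entries, each carrying a `filename` key. The helper name `filename_in_project_json` and the sample payload are hypothetical and written for illustration; this is not twine's `package_is_uploaded`.

```python
from typing import Any, Mapping


def filename_in_project_json(payload: Mapping[str, Any], filename: str) -> bool:
    """Return True if `filename` is listed under any release in the payload."""
    # Assumption: the payload has a "releases" mapping of version -> list of
    # file entries, and each entry is a dict with a "filename" key.
    releases = payload.get("releases", {})
    return any(
        file_info.get("filename") == filename
        for files in releases.values()
        for file_info in files
    )


# Hand-written stand-in for a parsed JSON response; not real index data.
sample_payload = {
    "releases": {
        "0.0.1.post1": [
            {"filename": "mobyfubarbbq-0.0.1.post1-py3-none-any.whl"},
            {"filename": "mobyfubarbbq-0.0.1.post1.tar.gz"},
        ],
    }
}

for name in (
    "mobyfubarbbq-0.0.1.post1.tar.gz",
    "mobyfubarbbq-0.0.1rc2.tar.gz",
):
    # Only files missing from the response would need an upload attempt.
    action = "skip" if filename_in_project_json(sample_payload, name) else "upload"
    print(f"{action}: {name}")
```

With a check like this, the progress bar would only need to appear for files that are absent from the response.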
With third-party package indices, there isn't a JSON API.
Does the upload url return a 200 response if the file has already been uploaded? Could that be used to avoid an upload if --skip-existing is set?
In Chromebrew we just check for a 200 in the output of curl -sI <url> for our upload URL to determine if a file has been uploaded (though this doesn't check to see if the file has been properly uploaded), and then avoid an upload if that is the case.
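For reference, the `curl -sI` check could be sketched in Python roughly as follows, using only the standard library. The function names `head_status` and `should_upload` are hypothetical, and treating a 200 as "already uploaded" is an assumption about one particular registry, not something every index guarantees.

```python
import urllib.error
import urllib.request


def head_status(url: str, timeout: float = 10.0) -> int:
    """Send a HEAD request and return the HTTP status code."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urlopen raises for 4xx/5xx responses; the code is still what we want.
        return error.code


def should_upload(status: int) -> bool:
    """Assumption: a 200 on the file URL means the file is already there."""
    return status != 200
```

A caller would run `should_upload(head_status(file_url))` before sending the file. As with the curl version, this only shows that something answers at that URL, not that the upload is intact.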
It returns a 409 if I remember correctly, but more generally a 4xx response even if not a 409. This is also trivially confirmed by trying to reupload a release artifact without using this flag. I'm on a phone so I can't reproduce for you.
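As a rough sketch of what skipping based on the response might look like: the helper below treats a 409 as a definite duplicate and any other 4xx as a duplicate only if the response text says so. The name `looks_like_existing_file` and the "already exist" text match are assumptions made for illustration, not twine's actual logic or any index's documented behavior.

```python
def looks_like_existing_file(status: int, body: str = "") -> bool:
    """Guess whether a failed upload means the file already exists."""
    if status == 409:
        # Conflict: the clearest signal that the file is already present.
        return True
    if 400 <= status < 500:
        # Assumption: other 4xx responses mention the duplicate in their text.
        return "already exist" in body.lower()
    return False


# Made-up responses for illustration only.
for status, body in [
    (409, ""),
    (400, "File already exists."),
    (403, "Invalid or non-existent authentication information."),
    (200, ""),
]:
    print(status, looks_like_existing_file(status, body))
```

Because this decision depends on the response to the upload itself, the file has already been sent by the time it can be made.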
Hmm, just to be parsimonious with bandwidth, would it make sense to search the index for an uploaded file before attempting an upload? Or to request only the headers for an upload URL before uploading and rely on the difference in return codes? A 409 versus something else should be sufficient to tell the two cases apart, if they are indeed different.
On the non-PyPI GitLab package registry we also use, we can definitely just look for the 200 code... Hmm, maybe that's GitLab-only?
Every non-PyPI registry will be different, in my experience. Honestly, life would be simpler if there were a PyPI facade they could all use instead, but alas, the NIH is strong amongst a lot of these companies.