
In pypi, it is impossible to reupload a removed file.

Open Natim opened this issue 9 years ago • 103 comments

HTTPError: 400 Client Error: This filename has previously been used, you should use a different version.

Natim avatar Sep 04 '15 13:09 Natim

Also the previous version has been removed and is impossible to find.

Natim avatar Sep 04 '15 13:09 Natim

It's probably still available in the Fastly caches, which is why you need to use a new filename. The old filename will have been marked to be cached indefinitely, so even if you could upload a file with the same name, anyone who had already fetched the old version would never get the new one.

daenney avatar Sep 04 '15 13:09 daenney

In my case it isn't a problem because it is the exact same file.

Natim avatar Sep 04 '15 13:09 Natim

See Donald's email at http://comments.gmane.org/gmane.comp.python.distutils.devel/22739

I've pushed changes to PyPI where it is no longer possible to reuse a filename and attempting to do it will give a 400 error "This filename has previously been used, you should use a different version."
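
For illustration only, the rule described in that email can be sketched as a toy in-memory index that remembers every filename it has ever accepted. The names `ToyIndex` and `FilenameReused` are hypothetical and say nothing about how PyPI actually implements the check.

```python
class FilenameReused(Exception):
    """Raised when an upload reuses a filename that was ever accepted."""


class ToyIndex:
    """Hypothetical in-memory index, not PyPI's real implementation."""

    def __init__(self):
        self.files = {}              # filename -> bytes currently served
        self.used_filenames = set()  # every filename ever accepted

    def upload(self, filename, data):
        if filename in self.used_filenames:
            raise FilenameReused(
                "400 Client Error: This filename has previously been used, "
                "you should use a different version."
            )
        self.used_filenames.add(filename)
        self.files[filename] = data

    def delete(self, filename):
        # The file disappears, but its name stays reserved forever.
        self.files.pop(filename, None)
```

In this sketch, deleting a file and then uploading the same filename again raises `FilenameReused`, which mirrors the error reported at the top of the thread.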

hickford avatar Sep 04 '15 13:09 hickford

Npm did the same in 2014. See http://blog.npmjs.org/post/77758351673/no-more-npm-publish-f

While it is annoying to have to bump the version number for typos or documentation changes, I believe in the long run, the benefits of greater reliability and data integrity are well worth it.

I presume the justification is the same for PyPI. It's an FAQ, so it should probably go in the documentation somewhere.

hickford avatar Sep 04 '15 14:09 hickford

Then we shouldn't allow people to remove their files if they cannot put them back.

Natim avatar Sep 04 '15 14:09 Natim

I think we should allow reuploading the same removed file.

Natim avatar Sep 04 '15 14:09 Natim

There are very good reasons for the current behavior. Authors should be able to delete for any number of reasons (legal, security, etc.). Users of the package should be able to rely on getting the exact same thing every time they install a package of a specific version.

If you delete a package that someone relies on, they know the version is gone and they need to make a change to fix it. If you could delete a package and replace it with something different but with the same version, it can break their program in any number of subtle ways, and it would be very hard to determine the cause of the problem.

Allowing this would break the entire version number contract. You may have what seems to be a good reason to replace a version but allowing it is not worth making versions unreliable.

tylerdave avatar Sep 04 '15 15:09 tylerdave

Absolutely.

hickford avatar Sep 04 '15 15:09 hickford

If you delete a package that someone relies on

You broke their package and you cannot put it back.

Natim avatar Sep 04 '15 15:09 Natim

If you could delete a package and replace it with something different but with the same version, it can break their program

That's not what I am asking for.

I am asking for putting back the package I removed.

Natim avatar Sep 04 '15 15:09 Natim

Allowing this would break the entire version number contract.

Allowing authors to put back the version they removed doesn't break any contract. Plus, you already have the previous package hash, so you can check that the version didn't change and that the author is really reuploading the file that was removed.

Natim avatar Sep 04 '15 15:09 Natim

There I agree. If it can be ensured via the hash that only the exact same package is uploaded to the same version then I don't see this being a problem in concept.
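
A minimal sketch of such a check, assuming the index kept the SHA-256 digest of the removed file (the function names and the choice of SHA-256 are assumptions made for illustration, not a description of PyPI):

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA-256 digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read in chunks so large wheels or sdists are not loaded at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def may_reupload(path, recorded_sha256):
    """Allow the reupload only if the file is byte-for-byte the removed one."""
    return sha256_of(path) == recorded_sha256.lower()
```

Any change to the file, even a rebuilt archive with identical source code, produces a different digest and would be rejected.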

tylerdave avatar Sep 04 '15 15:09 tylerdave

So long as the documentation and confirmation make it clear that unpublishing is permanent, I think it's reasonable and prudent.

It is generally considered bad behavior to remove versions of a library that others are depending on! Even if a package version is unpublished, that specific name and version combination can never be reused. In order to publish the package again, a new version number must be used.

https://docs.npmjs.com/cli/unpublish

hickford avatar Sep 04 '15 15:09 hickford

To prevent malicious abuse, perhaps the policy should be strengthened to 'no uploads to old versions' https://github.com/pypa/packaging-problems/issues/75

hickford avatar Sep 04 '15 15:09 hickford

Unless you still have the physical file lying around, it's unlikely you're going to have something that matches the same hash. The setup.py sdist command does not have deterministic output: each run produces a different file, even if the code hasn't changed. This also means you can't use setup.py to upload the file, since setup.py will only let you upload a file that it has created in the currently executing command, not an already created file. That doesn't make it impossible to upload a file with the same hash, but it makes it tricky which suggests it's a bad UX to expect authors to have to navigate.
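
To illustrate one plausible source of that non-determinism, here is a sketch that packs the same source bytes into two tar archives whose only difference is the member timestamp. The helper `build_archive` is hypothetical and is not what `setup.py sdist` runs; a real sdist is also gzip-compressed, and the gzip header carries its own timestamp field.

```python
import hashlib
import io
import tarfile


def build_archive(payload, mtime):
    """Pack one file into an in-memory tar archive with the given mtime."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        info = tarfile.TarInfo(name="pkg-1.0/module.py")
        info.size = len(payload)
        info.mtime = mtime  # the only value that differs between builds
        tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def sha256(data):
    return hashlib.sha256(data).hexdigest()


source = b"print('hello')\n"
first = sha256(build_archive(source, mtime=1441373400))
second = sha256(build_archive(source, mtime=1441373460))  # one minute later
same = sha256(build_archive(source, mtime=1441373400))

print(first == second)  # False: same code, different archive bytes
print(first == same)    # True: identical metadata gives identical bytes
```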

Most likely the eventual solution to this is that "delete" won't actually be a full out absolute deletion, it'll be more like a soft delete where it just acts as if it's deleted without actually deleting it (so it won't show up in the API, won't appear anywhere, etc) but there will be a list of these deleted things when the author logs in and a button that says "Restore" that allows them to restore a file they've previously deleted. Possibly this would have a periodic cleanup where if something was soft deleted for some period of time (a month? 6 months? a year?) we'll go through and clean it up and actually hard delete it then. Perhaps we'd also enable it for authors to trigger an immediate hard delete of something they've soft deleted, but there would be plenty of big warnings that if they press that button there is no recovery possible.
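
Roughly, and only as a sketch of the idea rather than a design, the soft delete could look like this. The class name, the method names, and the 180-day retention period are all placeholders picked for the example.

```python
from datetime import datetime, timedelta

RETENTION = timedelta(days=180)  # placeholder; the actual period is undecided


class SoftDeleteStore:
    """Hypothetical store where delete hides a file instead of destroying it."""

    def __init__(self):
        self.live = {}     # filename -> data, visible to everyone
        self.trash = {}    # filename -> (data, deleted_at), visible to the author
        self.used = set()  # filenames stay reserved even after a hard delete

    def upload(self, filename, data):
        if filename in self.used:
            raise ValueError("This filename has previously been used")
        self.used.add(filename)
        self.live[filename] = data

    def delete(self, filename, now):
        # Soft delete: the file vanishes from the public view only.
        self.trash[filename] = (self.live.pop(filename), now)

    def restore(self, filename):
        # The "Restore" button: put back the exact bytes that were removed.
        data, _deleted_at = self.trash.pop(filename)
        self.live[filename] = data

    def hard_delete(self, filename):
        # Author-triggered and irreversible.
        del self.trash[filename]

    def purge_expired(self, now):
        # Periodic cleanup of files that stayed soft deleted too long.
        expired = [
            name
            for name, (_data, deleted_at) in self.trash.items()
            if now - deleted_at >= RETENTION
        ]
        for name in expired:
            del self.trash[name]
        return expired
```

Because `restore` returns the stored bytes unchanged, the author never has to rebuild or reupload anything, which sidesteps the hash-matching problem entirely.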

dstufft avatar Sep 04 '15 15:09 dstufft

That doesn't make it impossible to upload a file with the same hash, but it makes it tricky which suggests it's a bad UX to expect authors to have to navigate.

With twine it is as simple as:

twine upload cliquet-2.5.0-py2.py3-none.whl

Natim avatar Sep 04 '15 15:09 Natim

Right, I wrote twine, but not everyone uses it, so you have to explain to them that they need twine to reupload, not setup.py upload. In addition, you have to explain to them that they need the exact same file, not one created the same way. It's fiddly and people will get confused.

dstufft avatar Sep 04 '15 15:09 dstufft

People are not dumb; if they need to do something complicated, they will eventually succeed. The fact is that even if they know all of these things, they won't be able to do it.

But yeah, #75 is a workaround for now (using .zip instead of .tar.gz, for instance).

Natim avatar Sep 04 '15 15:09 Natim

As I already wrote in #75:

I think that behaviour is quite OK for the live repo.

Though, to be honest, it's a huge PITA for the test repo. I support integrity and all that stuff on "production" systems. However, developers need to have their code / packages checked somewhere, and it's a PITA if you can't upload the same version twice while testing a new release.

There's no other way than the test repo to test your package. With git (or any other SCM) you can easily create a new branch and test it until you're sure everything works. Or if you have a look at PHP's Packagist (Composer), there's a -dev version for each development branch. It's the same on Docker: you can test your feature/release branches before tagging and going "live".

With the new policy you basically say: you have ONE SINGLE TRY and that one SHOULD WORK. No chance for a second try. IMHO this isn't the purpose of a testing system and it breaks the whole "we have a testing repo" idea. To be honest, I think this only leads to annoyed developers and a lot of "crippled versions" because developers couldn't properly test their versions before going live.

tl;dr: I suppose you do that on the live system but not on the test system.

domibarton avatar Dec 27 '15 00:12 domibarton

In my case, I forgot to sign the upload. It appears once you have uploaded the package it is impossible to fix any problems you made with the upload without making a new release. Even if you just want to upload the exact same version again.

brianmay avatar Mar 06 '16 10:03 brianmay

But how do you know it is "the exact same version"? Unless PyPI checks that the uploads are byte-for-byte identical, it would allow you to upload a totally different release with the same version, which can cause any number of problems.

daenney avatar Mar 06 '16 13:03 daenney

Just hit what @domibarton is describing. What's the point of a test repo if you can't make mistakes?

torarnv avatar Mar 14 '16 21:03 torarnv

Just hit what @domibarton is describing. What's the point of a test repo if you can't make mistakes?

Why can't you upload package x.y.z.dev0 and then package x.y.z.dev1?
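
As a rough sketch of what I mean, a small helper can compute the next development version for each test upload. The function `next_dev_version` is made up for this example and only handles the plain `.devN` suffix.

```python
import re

# Matches versions that already end in a ".devN" suffix.
_DEV_SUFFIX = re.compile(r"^(?P<base>.+?)\.dev(?P<n>\d+)$")


def next_dev_version(version):
    """Return the next development version to try on the test index."""
    match = _DEV_SUFFIX.match(version)
    if match:
        return f"{match.group('base')}.dev{int(match.group('n')) + 1}"
    # No dev suffix yet: start the series at dev0.
    return f"{version}.dev0"


print(next_dev_version("2.5.0"))       # 2.5.0.dev0
print(next_dev_version("2.5.0.dev0"))  # 2.5.0.dev1
```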

Natim avatar Mar 14 '16 21:03 Natim

I could, but then I'd have to remember to wipe those temporary changes from my working tree before pushing to the live PyPI repo.

torarnv avatar Mar 14 '16 22:03 torarnv

I uploaded a new version of Voltron yesterday and the server threw a 500 error during the upload. This resulted in a partial file being hosted as the current wheel for this package. The file size was smaller than my local one, and the hash differed.

This operation needs to be atomic. If the upload fails, you have no opportunity to try again. The only option is to use a different version number, which is not an appropriate solution.

IMO it should be a requirement that the hash of the upload is verified by the author before it is marked as "published".

snare avatar Apr 24 '16 07:04 snare

Yes I have the same problem with my last uploaded packages.

Natim avatar Apr 25 '16 08:04 Natim

My files were uploaded broken, due to a connection error, why can't I replace them?!

niedakh avatar Jun 03 '16 17:06 niedakh

I also suffer from this issue because of PyPI's recurrent HTTP 500 errors, which lead to incomplete uploads: the client errors out, but the file is created, which then makes it impossible to handle the intermittent upload error by simply retrying.

Having to bump version numbers just because of connection errors is quite annoying. Maybe the upload operation could be made atomic, so that it does not create a file unless everything completes normally (possibly by waiting for user confirmation).
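
Something along these lines is what I have in mind, as a sketch only: stage the incoming bytes in a temporary file, compare the digest with the one the client declared, and only then move the file into place. The function name `publish_atomically` and the use of SHA-256 are assumptions; I don't know how PyPI stores uploads.

```python
import hashlib
import os
import tempfile


def publish_atomically(chunks, dest_path, expected_sha256):
    """Create `dest_path` only if every chunk arrived and the digest matches."""
    if os.path.exists(dest_path):
        raise FileExistsError("This filename has previously been used")
    dest_dir = os.path.dirname(os.path.abspath(dest_path))
    digest = hashlib.sha256()
    # Stage in the destination directory so the final rename stays on one
    # filesystem, which is what makes os.replace atomic.
    fd, tmp_path = tempfile.mkstemp(dir=dest_dir, suffix=".partial")
    try:
        with os.fdopen(fd, "wb") as tmp:
            for chunk in chunks:  # may raise if the connection drops
                tmp.write(chunk)
                digest.update(chunk)
        if digest.hexdigest() != expected_sha256:
            raise ValueError("upload is incomplete or corrupted")
        os.replace(tmp_path, dest_path)
    except BaseException:
        # Any failure leaves no published file behind, so a retry can succeed.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

With this shape, a dropped connection or a truncated body never consumes the filename, so retrying the same upload works.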

pstch avatar Jun 13 '16 16:06 pstch

This might be helpful: https://mail.python.org/pipermail/distutils-sig/2016-June/029083.html

dstufft avatar Jun 13 '16 16:06 dstufft