[feat] uv publish --skip-existing
It would be great if uv publish had a --skip-existing option that ignores errors from files already on PyPI, similar to poetry's and twine's --skip-existing options.
Do you have a specific case that fails? uv publish already ignores it if you try to upload the same file again.
Mmmmhhh strange, I got this error:
10:57:30 uv build
10:57:30 Building source distribution...
10:57:30 Building wheel from source distribution...
10:57:30 Successfully built dist/rh_aws_saml_login-0.3.4.tar.gz and dist/rh_aws_saml_login-0.3.4-py3-****-any.whl
10:57:30 uv publish
10:57:30 warning: `uv publish` is experimental and may change without warning
10:57:30 Publishing 2 files https://upload.pypi.org/legacy/
10:57:30 Uploading rh_aws_saml_login-0.3.4-py3-****-any.whl (3.7KiB)
10:57:30 error: Failed to publish `dist/rh_aws_saml_login-0.3.4-py3-****-any.whl` to https://upload.pypi.org/legacy/
10:57:30 Caused by: Upload failed with status code 400 Bad Request: 400 File already exists ('rh_aws_saml_login-0.3.4-py3-****-any.whl', with blake2_256 hash '6c36e28fc15545d4e3d10a0c68e9d476afdb6ae2facdbb846a10098516575d0d'). See https://pypi.org/help/#file-name-reuse for more information.
10:57:30 make: *** [Makefile:39: pypi] Error 2
TBH, it's not exactly the same file, it's the lastest master, but it's the same version.
Could you give some more context on why you're doing the second upload? In the final uploaded version, would the source distribution and the wheel be from different commits?
I'm sorry. Our development and release flow is the following: Users create PRs, and each PR is merged into the main branch. The main branch is automatically built and published to PyPI. As long nobody bumps the version number in pyproject.toml, we don't want to publish (release) a new version. We used poetry in the past, and the --skip-existing option did the job.
Do you have a specific case that fails?
uv publishalready ignores it if you try to upload the same file aga
No it doesn't. UV will try to upload all pre-existing ones.
e.g. if I have published version 1.0.0, now wants to build and publish 1.0.1 locally, it it will try to upload 1.0.0 as well. I have to manually remove the older versions to make it work.
e.g. if I have published version 1.0.0, now wants to build and publish 1.0.1 locally, it it will try to upload 1.0.0 as well. I have to manually remove the older versions to make it work.
Are these 1.0.0 files the same that are on the index, and when uploading, do you get a "already exists but ok" message or do you get an error?
for the ci workflow @chassing mentioned (which is similar to what we use) the files aren't necessarily identical. they point is to use "this file already exists in the index" as a shorthand for "version hasn't changed, no need to upload" (by treating that specific error as being ok). i ported the twine approach to poetry, it's based on some heuristics https://github.com/pypa/twine/blob/ae71822a3cb0478d0f6a0cccb65d6f8e6275ece5/twine/commands/upload.py#L58-L72 so a bit hacky, but it works very well in practice
So I can verify that if the file is identical, the uv publish will pass, but will break if the code changes. And there is no info on "already exists but ok"
uv 0.4.18 (7b55e9790 2024-10-01)
@konstin -- Where did we land on this?
It seems problematic to try to upload different content and expect it to be treated the same way as uploading an identical file? It also seems reasonable to opt-in to ignoring failures in that case? Unless we want to add a command to check if a matching version exists or something? But that exposes you to TOCTOU bugs.
Yeah agree, which would lead us to adding a skip existing flag IIUC.
(if you decide to accept the feature request i'd be interested in having a go at implementing it (with some pointers))
With pypi today, a new release is created once you upload the first file for that release. Uploading additional files with the same package name and version number but a different filename adds them to the existing release. That is suboptimal: A release is published once a single file exists even if the wheels the users want are still missing, and it means that if something in your release process fails, you have a release with half the files. The pypi admins are also aware that this is undesirable, and there is work going on to fix this (https://discuss.python.org/t/pep-694-upload-2-0-api-for-python-package-repositories/16879).
Separately, unfortunately it seems that different implementations differ in how they react to reuploading files
- PyPI: When uploading the same file again, you get a 200 OK with an empty body. When uploading a different file, you get an error. We can’t tell apart whether an upload was successful or skipped.
- GitLab: When uploading a file with the same name, you get a “400 Bad request - Validation failed: File name has already been taken”, no matter whether the file was identical or not.
- Codeberg/forgejo: Error 409 “package file already exists”, no matter whether the file was identical or not.
The current skip existing behavior was based on PyPI’s behavior, while copying twine response behavior for compatibility with alternative indexes (the ones @davidszotten mentioned) as default without parameters (misunderstanding how those behave for non-pypi), resulting in a strange mix of behaviours.
Skip existing packages in its basic form is necessary: If a CI publish task fails mid-way uploading files, the user should be able to restart the whole workflow, with only new files being reuploaded.
Preferably, we’d error when we try to upload files that have the same name when uploading, but different content, otherwise the following scenario could: You build foo 1.2.3 into foo-1.2.3.tar.gz and foo-1.2.3-cp39-abi3-manylinux2014.whl, but upload only the source distribution foo-1.2.3.tar.gz (let’s say, because they wheel build failed or because the CI machine crashed after the source dist upload). Now you make some code changes, rebuild foo 1.2.3 and try uploading foo-1.2.3.tar.gz and foo-1.2.3-cp39-abi3-manylinux2014.whl. foo-1.2.3.tar.gz already exists, so only foo-1.2.3-cp39-abi3-manylinux2014.whl gets uploaded. Due to the code changes, the published source distribution and wheel mismatch: A mac or windows user building from the source distribution will get different code than the linux user, breaking the contract (and core assumption in packaging tools) that all files in a release match.
One option is adding index url as mandatory parameter to skip-existing: By using the hashes in the index URL, we check before uploading whether a file with the same filename does not yet exist (new upload), a file with the same name and same content/hash exists (we can skip it, such as a 1.0 that is still present in your dist/ when you're now publishing 1.1) or a file with the same name but different content exists (we error). The disadvantage is that this is more complex both on the user side and the uv side than twine's --skip-existing.
e.g. if I have published version 1.0.0, now wants to build and publish 1.0.1 locally, it it will try to upload 1.0.0 as well. I have to manually remove the older versions to make it work.
We could take a similar approach as poetry publish here and require the project's pyproject.toml to be present (either in the current directory or as part of the workspace), then only upload the version defined there. This would change CI workflows in some cases because it requires a checkout step for the publish step (e.g. if you build multiple wheels for different platforms, then collect and upload in a separate job, that job needs a checkout too).
i agree that the current implementations are a bit of a hack (and from your detailed description, perhaps even dangerous). just a note that the proposal of checking version against pyproject.toml doesn't satisfy mine (or op's i think) use-case of using skip-existing as a proxy for "was the version just bumped". (i'm open to ideas of other ways of achieving this though)
In my workflow, I would also need to generate a dev version on the fly and publish a random dev version on every merge to main. Please also keep this in mind, when you devise uv publish. (In the end, probably just edit version in pyproject.toml in CD, and not interfere the workflow @konstin proposed here.)
@TiansuYu Could you expand on your overall workflow, i.e. how does the wheel get its random dev tag and where/how does this wheel from main get used?
I would somehow edit the version string e.g. 1.0.0 -> 1.0.0.dev0 (using a script or something, or if uv can offer something like uv version dev kinda like poetry version patch but append a random version string.) Then run uv build and uv publish. This is what I mean.
I also use a "publish development release on main" pattern
https://github.com/zanieb/poetry-relax/blob/5706001e383a2f3327f4ab7c2ced2b57d39e287a/.github/workflows/build.yaml#L51-L60
https://github.com/zanieb/poetry-relax/blob/5706001e383a2f3327f4ab7c2ced2b57d39e287a/scripts/version#L21-L25
For skip existing we can use the following design:
The user can pass --keep-existing <index url>. For each file given to upload, uv checks whether the file is already on the index. If the filename does not exist, uv will upload the file. If the filename does exist and the file on the index is the exact same as the local file (hash match), we skip the upload as there's nothing to do. If the file exists and mismatches with different content, we error: There is an inconsistency, and uv does not upload in this case.
The upload itself may then error or succeed. If it errors, we check the index URL again: Maybe there was a parallel upload of the same file, and the other upload was quicker (avoid TOCTOU errors). We then apply the same logic: If the filename does exist and the file on the index is the same (hash match), it's ok, there was an upload race. If the file exists and mismatches, we error: There is an inconsistency, and uv does not upload in this case. I'm not sure yet if we should do the second check only if the error is one of the twine-recognized file-already-existed conditions; i tend to think we can just do the second check on any error.
This enables re-trying publish, but it still requires the author of a package to ensure that they only try to publish each version from a specific commit. We impose this requirement to avoid situations where different files in a version were built from different sources, which breaks a fundamental assumption in uv.
Popping in to say I'm affected by this. We run a monorepo and all packages are built when a tag is pushed, but the tag doesn't change the version of all packages because of prefixing. So it tries to publish all built packages, but not all of them have a new version. twine's skip existing means that only the genuine changes are published. uv in its current state gives
$ uv publish
warning: `uv publish` is experimental and may change without warning
Publishing 10 files https://index.company.com/...
Uploading my_pkg-0.2.1.dev136+g27f686d-py3-none-any.whl (2.4KiB)
error: Failed to publish `dist/my_pkg-0.2.1.dev136+g27f686d-py3-none-any.whl` to https://index.company.com/...
Caused by: Upload failed with status code 403 Forbidden. Server says: {
"errors" : [ {
"status" : 403,
"message" : "Not enough permissions to delete/overwrite artifact 'index:my_pkg/0.2.1.dev136+g27f686d/my_pkg-0.2.1.dev136+g27f686d-py3-none-any.whl' (user: 'user' needs DELETE permission)."
} ]
}
I could build a script to check which build are changed compared to the index and only publish those ones, but whyh would I when twine upload --skip-existing exists.
Did you look at --check-index support? https://github.com/astral-sh/uv/pull/8531
Hey @charliermarsh we are impacted by this as well - --check-url is surprising in how it works in that it will error if the file exists and does not match the local hash. This is fundamentally different than how we expect something akin to --skip-existing to work. With --check-url, we get an error error: Local file and index file do not match. The expected behavior is that if the filename exists in the index we've pointed uv to, it should exit successfully.
@fwallacevt -- I hear you, though that is an intentional design decision. We think it should be an error if you're trying to upload a package to an index and the contents differ from the existing version. It's a sign that something is off in the build or publish process.
@charliermarsh we're using uv workspaces, and right now we run uv build ... --all, then publish everything that is built. What would you suggest as an alternative to only build packages if they should be built (e.g. version has changed)? Or is there another way you envision this workflow?
I'd recommend explicitly selecting the package(s) that changed and you want to publish. I know it's more effort than having the index figure it out, but it avoids publishing inconsistent sets of distributions for a version.
That's our use-case too @fwallacevt... I guess we'll stick with twine for just the publishing step.
@konstin @charliermarsh are you open to making this an optional feature? E.g. adding a flag that dictates whether a conflict results in a warning or an error?
Another option might be to use different exit codes for uv publish to be able to make "local and remote hash don't match" and "some other error occurred" programmatically distinct errors. Then, for this use case, the exit code could be checked and ignored only if it's the "local and remote hash don't match" case. uv publish || true would work as-is but it risks ignoring the "some other error occurred" cases which is not great.
Edit: I've ended up just checking for the string "Local file and index file do not match" in stderr when uv publish exits non-zero and then ignoring the failure in that case for now, but I would really prefer a solution that doesn't evoke Hyrum's Law.