warehouse
Possibly flaky zip confusion detection in the upload endpoint
Describe the bug
I triggered a long-existing release automation in GHA and it failed with
ERROR HTTPError: 400 Bad Request from https://upload.pypi.org/legacy/
Invalid distribution file. ZIP archive not accepted: Trailing data
I eventually realized that this must be https://github.com/pypi/warehouse/pull/18492/files#diff-c82bacd0e7b8c5b66fa409fb14ee9258d051e6eb5e33102887251641b5bd9747R310.
But I don't understand what might've caused it or how to validate things. The GHA workflow is stuck on an older cibuildwheel because of upgrade blockers, so I thought that might've caused the problem. But I really don't know.
One point of confusion was that the same workflow has jobs uploading to PyPI and TestPyPI. And the TestPyPI upload didn't have any problems.
Later, I triggered a new release and it succeeded. It's a mystery why the first publishing attempt failed on one of the wheels but I suspect that the check might be flaky or the dist got corrupted during upload.
Expected behavior
The check should not produce false positives, or it should at least give instructions on how to check the seemingly illegal dists locally.
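For anyone wanting to check a dist locally, here is a minimal sketch of one way to look for trailing data. It is only a rough approximation and not Warehouse's actual check: it assumes the last `PK\x05\x06` signature in the file is the real end-of-central-directory (EOCD) record and it ignores ZIP64. The function name `trailing_bytes` is made up for illustration.

```python
import struct

EOCD_SIGNATURE = b"PK\x05\x06"  # end-of-central-directory record marker
EOCD_FIXED_SIZE = 22            # size of the EOCD record without its comment


def trailing_bytes(data: bytes) -> int:
    """Return how many bytes follow the end of the ZIP's EOCD record.

    Assumption: the last EOCD signature in the data is the genuine one.
    A signature embedded in a comment or in junk data would confuse this.
    """
    offset = data.rfind(EOCD_SIGNATURE)
    if offset == -1 or offset + EOCD_FIXED_SIZE > len(data):
        raise ValueError("no complete end-of-central-directory record found")
    # The archive comment length is a little-endian uint16 at offset 20.
    (comment_length,) = struct.unpack_from("<H", data, offset + 20)
    expected_end = offset + EOCD_FIXED_SIZE + comment_length
    return len(data) - expected_end
```

Calling it as `trailing_bytes(open(path, "rb").read())` on a wheel should give `0` for an archive that ends right after its EOCD record. A positive number means extra bytes follow it, and a negative one suggests a truncated comment.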
To Reproduce
Zero clue.
My Platform
GHA but doesn't really matter.
Additional context
This is all I've got for now. It's not much but I figured it's better to have it documented in public for possible future investigations.
The failure in question is visible at https://github.com/ansible/pylibssh/actions/runs/18421022338/job/52496497562#step:3:332.
@webknjaz Thanks for the report, do you happen to have the wheel that failed to upload on hand?
@webknjaz I did a little poking around and found that the SHA256 of ansible_pylibssh-1.3.0a0-cp310-cp310-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (the wheel that failed) that Sigstore computed during the PyPI upload (cf5939d3918831bdb5178afa5818eedcc154fd202f64d38de7ad281702f50a93) is different than the SHA256 I was able to find in GitHub artifacts (d6ef10b8606c646f7a8c8d9cb3839660c82186f3c1f5eaa155c3fa43b8ab4fa2). Checking another artifact (ansible_pylibssh-1.3.0a0-cp310-cp310-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl) shows that the SHA256 matches between GitHub artifacts and what Sigstore computed.
Are there any steps that modify the ZIP during the publish process? Quite strange.
No extra steps. It just runs cibuildwheel, which of course does auditwheel repair. This is the only thing I can think of that would modify what setuptools produces. Another thing would be GHA's artifact merging mechanism. I've hit race conditions in their artifacts API a couple of times in the past but they never fixed it (https://github.com/actions/download-artifact/issues/140#issuecomment-1314062872 / https://github.com/actions/toolkit/issues/1235).
So my primary theory is that GHA artifacts got corrupted on download into the publishing job. This is the only explanation I have right now because the TestPyPI job in the same workflow run should be working with exactly the same artifacts. The PyPI job was being executed a bit later but I don't understand how timing would affect this.
Is it possible to check if TestPyPI received the same wheel byte-for-byte?
Looking at https://test.pypi.org/project/ansible-pylibssh/1.3.0a0/#files, it does look like TestPyPI had the same files uploaded as the ones in GHA artifacts.
Full list of hashes, note that the (1) filenames are from TestPyPI:
7b24c62688d5bb3d35bec06acb6c5bd496936b947d8b8be5f1ec9030c4a975ab ansible_pylibssh-1.3.0a0-cp310-cp310-manylinux_2_27_aarch64.manylinux_2_28_aarch64(1).whl
7b24c62688d5bb3d35bec06acb6c5bd496936b947d8b8be5f1ec9030c4a975ab ansible_pylibssh-1.3.0a0-cp310-cp310-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl
9bc8f02f58cab4a34ac870eb578d02e3c5e53a88120886c25b30198418578f13 ansible_pylibssh-1.3.0a0-cp310-cp310-manylinux_2_27_ppc64le.manylinux_2_28_ppc64le(1).whl
9bc8f02f58cab4a34ac870eb578d02e3c5e53a88120886c25b30198418578f13 ansible_pylibssh-1.3.0a0-cp310-cp310-manylinux_2_27_ppc64le.manylinux_2_28_ppc64le.whl
cef00c730cbe8863a69a63314a4c958431502d4e617ac6a65fa6c10b45fe5d37 ansible_pylibssh-1.3.0a0-cp310-cp310-manylinux_2_27_s390x.manylinux_2_28_s390x(1).whl
cef00c730cbe8863a69a63314a4c958431502d4e617ac6a65fa6c10b45fe5d37 ansible_pylibssh-1.3.0a0-cp310-cp310-manylinux_2_27_s390x.manylinux_2_28_s390x.whl
d6ef10b8606c646f7a8c8d9cb3839660c82186f3c1f5eaa155c3fa43b8ab4fa2 ansible_pylibssh-1.3.0a0-cp310-cp310-manylinux_2_27_x86_64.manylinux_2_28_x86_64(1).whl
d6ef10b8606c646f7a8c8d9cb3839660c82186f3c1f5eaa155c3fa43b8ab4fa2 ansible_pylibssh-1.3.0a0-cp310-cp310-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl
533286335db65b8298b108fd2c6ad294c8d57df70459509dbd6478f06c18c8f8 ansible_pylibssh-1.3.0a0-cp311-cp311-manylinux_2_27_aarch64.manylinux_2_28_aarch64(1).whl
533286335db65b8298b108fd2c6ad294c8d57df70459509dbd6478f06c18c8f8 ansible_pylibssh-1.3.0a0-cp311-cp311-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl
37ee9e8abfcc8fa4efc8ce4969bb47fbabcacd1d820be4631d9254e3e7c5d4c2 ansible_pylibssh-1.3.0a0-cp311-cp311-manylinux_2_27_ppc64le.manylinux_2_28_ppc64le(1).whl
37ee9e8abfcc8fa4efc8ce4969bb47fbabcacd1d820be4631d9254e3e7c5d4c2 ansible_pylibssh-1.3.0a0-cp311-cp311-manylinux_2_27_ppc64le.manylinux_2_28_ppc64le.whl
dabb6e35add1ed29153315f45723aa0c04d79f9cc0323f778974acc5e5c07cd8 ansible_pylibssh-1.3.0a0-cp311-cp311-manylinux_2_27_s390x.manylinux_2_28_s390x(1).whl
dabb6e35add1ed29153315f45723aa0c04d79f9cc0323f778974acc5e5c07cd8 ansible_pylibssh-1.3.0a0-cp311-cp311-manylinux_2_27_s390x.manylinux_2_28_s390x.whl
a3755e4e4f931fd37ff517b48273764cfcf834679efca6f15eae150692b60449 ansible_pylibssh-1.3.0a0-cp311-cp311-manylinux_2_27_x86_64.manylinux_2_28_x86_64(1).whl
a3755e4e4f931fd37ff517b48273764cfcf834679efca6f15eae150692b60449 ansible_pylibssh-1.3.0a0-cp311-cp311-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl
a808f7f64b33a3eef2e52a8945bb049e5e629c718872d12bb0b7d627cce701f9 ansible_pylibssh-1.3.0a0-cp312-cp312-manylinux_2_27_aarch64.manylinux_2_28_aarch64(1).whl
a808f7f64b33a3eef2e52a8945bb049e5e629c718872d12bb0b7d627cce701f9 ansible_pylibssh-1.3.0a0-cp312-cp312-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl
9a2367ac544786d9e77a2d57fce578699a46346b260d243ec3d58b36a1b02e07 ansible_pylibssh-1.3.0a0-cp312-cp312-manylinux_2_27_ppc64le.manylinux_2_28_ppc64le(1).whl
9a2367ac544786d9e77a2d57fce578699a46346b260d243ec3d58b36a1b02e07 ansible_pylibssh-1.3.0a0-cp312-cp312-manylinux_2_27_ppc64le.manylinux_2_28_ppc64le.whl
9638fa21a1f45ddf5c48d8ea914be9aa17e703f928123e2342f1838e474d7bf6 ansible_pylibssh-1.3.0a0-cp312-cp312-manylinux_2_27_s390x.manylinux_2_28_s390x(1).whl
9638fa21a1f45ddf5c48d8ea914be9aa17e703f928123e2342f1838e474d7bf6 ansible_pylibssh-1.3.0a0-cp312-cp312-manylinux_2_27_s390x.manylinux_2_28_s390x.whl
ed8dacf55c8c4e9723a05214e7072cdb36b3427ec142b4746221de70685fbc7f ansible_pylibssh-1.3.0a0-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_28_x86_64(1).whl
ed8dacf55c8c4e9723a05214e7072cdb36b3427ec142b4746221de70685fbc7f ansible_pylibssh-1.3.0a0-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl
27460931b0f4302ad1683b461d67660800506d44a3bbed4a634d42d47bc757da ansible_pylibssh-1.3.0a0-cp39-cp39-manylinux_2_27_aarch64.manylinux_2_28_aarch64(1).whl
27460931b0f4302ad1683b461d67660800506d44a3bbed4a634d42d47bc757da ansible_pylibssh-1.3.0a0-cp39-cp39-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl
21b43246c291e713aeae01bc626e494177dba1ebace3d94eab8d632338ba58ef ansible_pylibssh-1.3.0a0-cp39-cp39-manylinux_2_27_ppc64le.manylinux_2_28_ppc64le(1).whl
21b43246c291e713aeae01bc626e494177dba1ebace3d94eab8d632338ba58ef ansible_pylibssh-1.3.0a0-cp39-cp39-manylinux_2_27_ppc64le.manylinux_2_28_ppc64le.whl
10237cd4aa3ccb3198525fbfa45e17becd5c4a9b1158773e96918cce5ca865f4 ansible_pylibssh-1.3.0a0-cp39-cp39-manylinux_2_27_s390x.manylinux_2_28_s390x(1).whl
10237cd4aa3ccb3198525fbfa45e17becd5c4a9b1158773e96918cce5ca865f4 ansible_pylibssh-1.3.0a0-cp39-cp39-manylinux_2_27_s390x.manylinux_2_28_s390x.whl
cd40c7bd955e5e33f275578baf4c721c59fa2a4e2a0cfabd8c2686df018459d0 ansible_pylibssh-1.3.0a0-cp39-cp39-manylinux_2_27_x86_64.manylinux_2_28_x86_64(1).whl
cd40c7bd955e5e33f275578baf4c721c59fa2a4e2a0cfabd8c2686df018459d0 ansible_pylibssh-1.3.0a0-cp39-cp39-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl
~@sethmlarson where did you get the (1) files? Are these duplicates within GHA artifact?~ Oh, never mind. I see you've put both index uploads side-by-side, right?
@webknjaz Yeah they're side-by-side, basically hashes from GHA artifacts and TestPyPI match.
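A side-by-side listing like the one above can also be checked mechanically. This is a small sketch, assuming `sha256sum`-style lines (`<digest> <filename>`) and a browser-style `(1)` suffix on the second copy of each wheel. The helper name `mismatched_files` is hypothetical.

```python
import re
from collections import defaultdict


def mismatched_files(listing: str) -> dict[str, set[str]]:
    """Group digests by filename and return the names with conflicting digests."""
    digests: dict[str, set[str]] = defaultdict(set)
    for line in listing.splitlines():
        if not line.strip():
            continue
        digest, name = line.split(maxsplit=1)
        # Drop a "(1)"-style duplicate suffix so both copies share one key.
        name = re.sub(r"\(\d+\)(?=\.whl$)", "", name.strip())
        digests[name].add(digest.lower())
    return {name: found for name, found in digests.items() if len(found) > 1}
```

An empty result means every file has a single digest across both sources. Any name that comes back has at least two different digests and is worth a closer look.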
I guess this issue can be reclassified from bug to UX. And hopefully getting cibuildwheel to print out the hashes + publishing jobs doing the same should be able to make this kind of artifact corruption at least observable...
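As a rough idea of what "printing out the hashes" could look like in both the build and the publish jobs, here is a minimal sketch. The `dist` directory name and the function names are assumptions for illustration, and this is not something cibuildwheel or the publish action is known to do today.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large wheels are not loaded into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def print_manifest(dist_dir: str = "dist") -> dict[str, str]:
    """Print sha256sum-style lines for every file in dist_dir and return them."""
    manifest = {
        path.name: sha256_of(path)
        for path in sorted(Path(dist_dir).iterdir())
        if path.is_file()
    }
    for name, digest in manifest.items():
        print(f"{digest}  {name}")
    return manifest
```

Running the same step right after the build and again right before the upload would leave two lists in the job logs, so a file that changed in between would show up as a differing digest.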
I poked around at this a bit, and I wonder if this was maybe an unzip discrepancy -- auditwheel used to shell out to unzip instead of using the zipfile module, and it seems not unlikely that unzip could produce junk trailing data at the end of archives.
For context, it looks like auditwheel switched to zipfile in 5.0.0 in 2021: https://github.com/pypa/auditwheel/issues/327
@woodruffw I don't think that's possible since both jobs received wheels from the same source. I blame GHA's race conditions.
Ah yeah, if there's no variation in the source it seems unlikely. It's very concerning that GHA could race in this way 😬