warehouse
warehouse copied to clipboard
Verify release URLs using Trusted Publisher information
This PR builds on top of the work in https://github.com/pypi/warehouse/pull/15891. The goal is to:
- Add a new
verifiedboolean field to eachReleaseURLthat is set toTrueif the URL matches the Trusted Publisher URL- For example, a Source URL of
github.com/user/projectshould appear as verified if the release was uploaded using a GitHub Actions Trusted Publisher configured withgithub.com/user/projectas its identity.
- For example, a Source URL of
- Verify the URLs of a release when a new file is uploaded
- [x] TODO: The current implementation only marks a URL as verified during the first file upload of a release (when the release is created). Change this so that subsequent uploads can also mark URLs as verified.
@di @woodruffw
TODO: The current implementation only marks a URL as verified during the first file upload of a release (when the release is created). Change this so that subsequent uploads can also mark URLs as verified.
I think this is probably fine as-is for now, as it will cover the majority of use cases, but once we support verifying URLs when attestations are uploaded we should revisit this to re-verify URLs when a file is added to a release (and, likely, add any additional URLs that were not present on the first file, if there are any).
once we support verifying URLs when attestations are uploaded
~I thought the URL verification would always be via Trusted Publisher, how do the attestations fit here?~ From a conversation: We'll also verify URLs when uploading non-publish attestations that are not necessarily uploaded via Trusted Publishing
In the future we might have attestations that are not related to a publish that could add additional verification, but I think this is an edge case we can consider later. If we want to include updates here to handle verification for subsequent file uploads that would be fine too!
@di For verifying URLs of existing releases (when the user uploads files that are not the first file of that release), do we want to verify the URLs using the Trusted Publisher iff the file is successfully uploaded?
The alternative would be to also allow marking URLs as verified for any Trusted Publishing upload attempt that is correctly authenticated, even if the file ends up not being successfully uploaded.
I think only a successful upload makes sense. There are many reasons why an upload would fail, I don't think it's necessary to tease apart which ones would still make verification valid and which ones might not.
I pushed a commit for verifying also URLs of existing releases. Now, if a file is uploaded for an existing release, we update the verified status of the release's URLs if they were not verified but the current file upload did verify them.
This changeset appears to have introduced some warnings in the test suite:
tests/unit/forklift/test_legacy.py::TestFileUpload::test_new_release_url_verified[https://github.com/foo-False]
tests/unit/forklift/test_legacy.py::TestFileUpload::test_new_release_url_verified[https://github.com/foo/bar/-True]
tests/unit/forklift/test_legacy.py::TestFileUpload::test_new_release_url_verified[https://github.com/foo/bar-True]
/opt/warehouse/src/tests/unit/forklift/test_legacy.py:3906: SAWarning: SELECT statement has a cartesian product between FROM element(s) "releases" and FROM element "release_urls". Apply join condition(s) between each element to resolve.
db_request.db.query(ReleaseURL).filter(Release.project == project).one()
tests/unit/forklift/test_legacy.py::TestFileUpload::test_new_release_url_verified[https://google.com-False]
/opt/warehouse/src/tests/unit/forklift/test_legacy.py:3906: SAWarning: SELECT statement has a cartesian product between FROM element(s) "release_urls" and FROM element "releases". Apply join condition(s) between each element to resolve.
db_request.db.query(ReleaseURL).filter(Release.project == project).one()
tests/unit/forklift/test_legacy.py::TestFileUpload::test_new_publisher_verifies_existing_release_url
/opt/warehouse/src/tests/unit/forklift/test_legacy.py:3984: SAWarning: SELECT statement has a cartesian product between FROM element(s) "releases" and FROM element "release_urls". Apply join condition(s) between each element to resolve.
db_request.db.query(ReleaseURL).filter(Release.project == project).all()
Curious - why would these query not leverage the relationship?
https://github.com/pypi/warehouse/blob/8a5ad08a1bf65b769efaa97ad0f49b7739a20857/tests/unit/forklift/test_legacy.py#L3906 https://github.com/pypi/warehouse/blob/8a5ad08a1bf65b769efaa97ad0f49b7739a20857/tests/unit/forklift/test_legacy.py#L3984
This changeset appears to have introduced some warnings in the test suite:
Looking into it now
This changeset appears to have introduced some warnings in the test suite:
PR open to fix it here: https://github.com/pypi/warehouse/pull/16528
PR to prevent recurrence here: https://github.com/pypi/warehouse/pull/16529