warehouse icon indicating copy to clipboard operation
warehouse copied to clipboard

Verify release URLs using Trusted Publisher information

Open facutuesca opened this issue 1 year ago • 6 comments

This PR builds on top of the work in https://github.com/pypi/warehouse/pull/15891. The goal is to:

  • Add a new verified boolean field to each ReleaseURL that is set to True if the URL matches the Trusted Publisher URL
    • For example, a Source URL of github.com/user/project should appear as verified if the release was uploaded using a GitHub Actions Trusted Publisher configured with github.com/user/project as its identity.
  • Verify the URLs of a release when a new file is uploaded
    • [x] TODO: The current implementation only marks a URL as verified during the first file upload of a release (when the release is created). Change this so that subsequent uploads can also mark URLs as verified.

@di @woodruffw

facutuesca avatar Jul 02 '24 13:07 facutuesca

TODO: The current implementation only marks a URL as verified during the first file upload of a release (when the release is created). Change this so that subsequent uploads can also mark URLs as verified.

I think this is probably fine as-is for now, as it will cover the majority of use cases, but once we support verifying URLs when attestations are uploaded we should revisit this to re-verify URLs when a file is added to a release (and, likely, add any additional URLs that were not present on the first file, if there are any).

di avatar Jul 02 '24 16:07 di

once we support verifying URLs when attestations are uploaded

~I thought the URL verification would always be via Trusted Publisher, how do the attestations fit here?~ From a conversation: We'll also verify URLs when uploading non-publish attestations that are not necessarily uploaded via Trusted Publishing

facutuesca avatar Jul 02 '24 16:07 facutuesca

In the future we might have attestations that are not related to a publish that could add additional verification, but I think this is an edge case we can consider later. If we want to include updates here to handle verification for subsequent file uploads that would be fine too!

di avatar Jul 02 '24 16:07 di

@di For verifying URLs of existing releases (when the user uploads files that are not the first file of that release), do we want to verify the URLs using the Trusted Publisher iff the file is successfully uploaded?

The alternative would be to also allow marking URLs as verified for any Trusted Publishing upload attempt that is correctly authenticated, even if the file ends up not being successfully uploaded.

facutuesca avatar Jul 08 '24 16:07 facutuesca

I think only a successful upload makes sense. There are many reasons why an upload would fail, I don't think it's necessary to tease apart which ones would still make verification valid and which ones might not.

di avatar Jul 08 '24 17:07 di

I pushed a commit for verifying also URLs of existing releases. Now, if a file is uploaded for an existing release, we update the verified status of the release's URLs if they were not verified but the current file upload did verify them.

facutuesca avatar Jul 19 '24 16:07 facutuesca

This changeset appears to have introduced some warnings in the test suite:

tests/unit/forklift/test_legacy.py::TestFileUpload::test_new_release_url_verified[https://github.com/foo-False]
tests/unit/forklift/test_legacy.py::TestFileUpload::test_new_release_url_verified[https://github.com/foo/bar/-True]
tests/unit/forklift/test_legacy.py::TestFileUpload::test_new_release_url_verified[https://github.com/foo/bar-True]
  /opt/warehouse/src/tests/unit/forklift/test_legacy.py:3906: SAWarning: SELECT statement has a cartesian product between FROM element(s) "releases" and FROM element "release_urls".  Apply join condition(s) between each element to resolve.
    db_request.db.query(ReleaseURL).filter(Release.project == project).one()

tests/unit/forklift/test_legacy.py::TestFileUpload::test_new_release_url_verified[https://google.com-False]
  /opt/warehouse/src/tests/unit/forklift/test_legacy.py:3906: SAWarning: SELECT statement has a cartesian product between FROM element(s) "release_urls" and FROM element "releases".  Apply join condition(s) between each element to resolve.
    db_request.db.query(ReleaseURL).filter(Release.project == project).one()

tests/unit/forklift/test_legacy.py::TestFileUpload::test_new_publisher_verifies_existing_release_url
  /opt/warehouse/src/tests/unit/forklift/test_legacy.py:3984: SAWarning: SELECT statement has a cartesian product between FROM element(s) "releases" and FROM element "release_urls".  Apply join condition(s) between each element to resolve.
    db_request.db.query(ReleaseURL).filter(Release.project == project).all()

Curious - why would these query not leverage the relationship?

https://github.com/pypi/warehouse/blob/8a5ad08a1bf65b769efaa97ad0f49b7739a20857/tests/unit/forklift/test_legacy.py#L3906 https://github.com/pypi/warehouse/blob/8a5ad08a1bf65b769efaa97ad0f49b7739a20857/tests/unit/forklift/test_legacy.py#L3984

miketheman avatar Aug 20 '24 16:08 miketheman

This changeset appears to have introduced some warnings in the test suite:

Looking into it now

facutuesca avatar Aug 20 '24 17:08 facutuesca

This changeset appears to have introduced some warnings in the test suite:

PR open to fix it here: https://github.com/pypi/warehouse/pull/16528

facutuesca avatar Aug 20 '24 17:08 facutuesca

PR to prevent recurrence here: https://github.com/pypi/warehouse/pull/16529

twm avatar Aug 20 '24 18:08 twm