`uv pip install` returning 403 from private pypi cloud instance backed by s3

Open philiplinden opened this issue 1 year ago • 20 comments

I am using a private pypi cloud instance backed by s3 (with no auth on my end). Public packages are resolved normally, but uv pip cannot resolve packages hosted on the private cloud instance.

  • pip3 install my-private-package --index-url https://my-pip-instance.example.com/ succeeded with no problems.
  • Running uv pip install with the --no-cache option did not change the result.
  • I can retrieve the artifact from the URL that uv pip was trying to fetch. That URL returns a 302 redirect to the URL that gives me the 403.
$ uv pip install my-private-package --index-url https://my-pip-instance.example.com/

error: Failed to build editables
  Caused by: Failed to build editable: file:///Users/phil/repos/philiplinden/scratch
  Caused by: Failed to install requirements from build-system.requires (resolve)
  Caused by: No solution found when resolving: setuptools
  Caused by: Failed to download: setuptools==69.1.1
  Caused by: HTTP status client error (403 Forbidden) for url (https://s3-url.amazonaws.com/example/setuptools-69.1.1-py3-none-any.whl?AWSAccessKeyId=xxx&Signature=xxx%3D&Expires=1709153725)

Relates to https://github.com/astral-sh/uv/issues/1709 and https://github.com/astral-sh/uv/pull/1902

Version: 0.1.6, 0.1.11

Verbose output (anonymized)

 uv_client::flat_index::from_entries 
 uv_installer::downloader::build_editables 
      0.355282s   0ms DEBUG uv_distribution::source Building (editable) file:///Users/phil/repos/philiplinden/scratch
   uv_dispatch::setup_build package_id="file:///Users/phil/repos/philiplinden/scratch", subdirectory=None
     uv_resolver::resolver::solve 
          0.361346s   0ms DEBUG uv_resolver::resolver Solving with target Python version 3.11.6
       uv_resolver::resolver::choose_version package=root
       uv_resolver::resolver::get_dependencies package=root, version=0a0.dev0
            0.361500s   0ms DEBUG uv_resolver::resolver Adding direct dependency: setuptools*
       uv_resolver::resolver::choose_version package=setuptools
         uv_resolver::resolver::package_wait package_name=setuptools
     uv_resolver::resolver::process_request request=Versions setuptools
       uv_client::registry_client::simple_api package=setuptools
         uv_client::cached_client::get_cacheable 
           uv_client::cached_client::read_and_parse_cache file=/Users/phil/Library/Caches/uv/simple-v1/8aba338bd0495f93/setuptools.rkyv
     uv_resolver::resolver::process_request request=Prefetch setuptools *
              0.366930s   5ms DEBUG uv_client::cached_client Found stale response for: https://my-pip-instance.example.com/simple/setuptools/
              0.366959s   5ms DEBUG uv_client::cached_client Sending revalidation request for: https://my-pip-instance.example.com/setuptools/
           uv_client::cached_client::revalidation_request url="https://my-pip-instance.example.com/setuptools/"
              1.016439s 654ms DEBUG uv_client::cached_client Found modified response for: https://my-pip-instance.example.com/simple/setuptools/
           uv_client::cached_client::new_cache file=/Users/phil/Library/Caches/uv/simple-v1/8aba338bd0495f93/setuptools.rkyv
           uv_client::registry_client::parse_simple_api package=setuptools
             uv_client::html::parse url=https://my-pip-instance.example.com/setuptools/
 uv_resolver::version_map::from_metadata 
       uv_distribution::distribution_database::get_or_build_wheel_metadata dist=setuptools==69.1.1
         uv_client::registry_client::wheel_metadata built_dist=setuptools==69.1.1
           uv_client::cached_client::get_serde 
             uv_client::cached_client::get_cacheable 
               uv_client::cached_client::read_and_parse_cache file=/Users/phil/Library/Caches/uv/wheels-v0/index/8aba338bd0495f93/setuptools/setuptools-69.1.1-py3-none-any.msgpack
            1.322675s 961ms DEBUG uv_resolver::resolver Searching for a compatible version of setuptools (*)
            1.322694s 961ms DEBUG uv_resolver::resolver Selecting: setuptools==69.1.1 (setuptools-69.1.1-py3-none-any.whl)
       uv_resolver::resolver::get_dependencies package=setuptools, version=69.1.1
         uv_resolver::resolver::distributions_wait package_id=setuptools-69.1.1
                  1.322754s   0ms DEBUG uv_client::cached_client No cache entry for: https://my-pip-instance.example.com/api/package/setuptools/setuptools-69.1.1-py3-none-any.whl#sha256=02fa291a0471b3a18b2b2481ed902af520c69e8ae0919c13da936542754b4c56
               uv_client::cached_client::fresh_request url="https://my-pip-instance.example.com/api/package/setuptools/setuptools-69.1.1-py3-none-any.whl#sha256=02fa291a0471b3a18b2b2481ed902af520c69e8ae0919c13da936542754b4c56"
error: Failed to build editables
  Caused by: Failed to build editable: file:///Users/phil/repos/philiplinden/scratch
  Caused by: Failed to install requirements from build-system.requires (resolve)
  Caused by: No solution found when resolving: setuptools
  Caused by: Failed to download: setuptools==69.1.1
  Caused by: HTTP status client error (403 Forbidden) for url (https://my-pip-instance.amazonaws.com/ffdf/setuptools/setuptools-69.1.1-py3-none-any.whl?AWSAccessKeyId=xxx&Signature=xxx&Expires=1709157923)

philiplinden avatar Feb 27 '24 22:02 philiplinden

Do you mind updating to v0.1.11? v0.1.6 is a few versions out-of-date.

charliermarsh avatar Feb 27 '24 22:02 charliermarsh

Thanks, I just updated to v0.1.11 and now the installer hangs at the same spot for a few seconds before throwing the same 403 error.

philiplinden avatar Feb 27 '24 23:02 philiplinden

Can you say a bit more about how the auth is intended to work? The URL is publicly available, and redirects you to S3 URLs with credentials embedded?

charliermarsh avatar Feb 27 '24 23:02 charliermarsh

Yeah, that's correct. It uses one-time pre-authed S3 URLs. I am using pypicloud with redirect_urls enabled.

philiplinden avatar Feb 28 '24 00:02 philiplinden

This also occurs when pip compiling from Gemfury links. The package lookup on Gemfury works, but downloading packages fails since they are backed by S3. This is the response from S3 when opening one of those pre-authed links:

Code: SignatureDoesNotMatch Message: The request signature we calculated does not match the signature you provided. Check your key and signing method.

amarckal avatar Mar 25 '24 12:03 amarckal

Hi. I probably don't understand half of the code in this repo, but after experimenting with uv pip install -vv locally, I think I might have found a clue:

https://github.com/astral-sh/uv/blob/661787b0cbb542f2e8e3bb4fbcc834a42af59f4b/crates/uv-client/src/registry_client.rs#L512-L515

Here, I believe req never has auth headers attached, because the AuthMiddleware only runs and attaches the auth header to req when the request is executed. In other words, I believe the headers are extracted too early in the linked code. Not sure how to fix it though 🤷🏻‍♂️

torarvid avatar Apr 04 '24 14:04 torarvid

That should be okay though, since the subsequent requests will also go through the auth middleware and get the appropriate headers attached.

Were you able to reproduce this issue? What does your setup look like?

charliermarsh avatar Apr 04 '24 17:04 charliermarsh

Yep, you're right, @charliermarsh. I just tried hard-coding in my credentials at that point in the code, and then I got a "Request already has an authorization header" error instead. My clue was not a clue after all 😢

But yes, I can reproduce. I believe I have the same issue as @amarckal, which is that when I use curl against pypi.fury.io to fetch my private package, I get a 302 redirect to a url like this:

https://s3.amazonaws.com/gemfury/gems/<redacted-path>/sdxp_0_7_3_py3_none_any_whl?x-acct=<redacted-acct>&X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=<redacted-cred>%2F20240404%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20240404T180859Z&X-Amz-Expires=900&X-Amz-SignedHeaders=host&X-Amz-Signature=<redacted-sig>

(I put <redacted-[whatever]> in places that could be sensitive).

When I run ./target/debug/uv pip install -vv --no-cache --extra-index-url https://{$GEMFURY_READ_TOKEN}@pypi.fury.io/oda/ sdxp==0.7.3, the last part of the output is (with some <redacted-[whatevers]> here as well):

           uv_client::cached_client::read_and_parse_cache file=/private/var/folders/2k/by15c_cs40l9swcck47g6lpm0000gn/T/.tmplRVtUe/wheels-v0/index/5f5b51aad86993d4/sdxp/sdxp-0.7.3-py3-none-any.msgpack
 uv_client::cached_client::from_path_sync path="/private/var/folders/2k/by15c_cs40l9swcck47g6lpm0000gn/T/.tmplRVtUe/wheels-v0/index/5f5b51aad86993d4/sdxp/sdxp-0.7.3-py3-none-any.msgpack"
                1.665639s   0ms TRACE uv_client::cached_client No cache entry exists for /private/var/folders/2k/by15c_cs40l9swcck47g6lpm0000gn/T/.tmplRVtUe/wheels-v0/index/5f5b51aad86993d4/sdxp/sdxp-0.7.3-py3-none-any.msgpack
              1.665827s   1ms DEBUG uv_client::cached_client No cache entry for: https://pypi.fury.io/oda/-/ver_Fz9wq/sdxp-0.7.3-py3-none-any.whl#sha256=869326637eef5de7d4312b82ecf9ba85fcd9e038273d54c0bbe7602d3b8529ad
           uv_client::cached_client::fresh_request url="https://pypi.fury.io/oda/-/ver_Fz9wq/sdxp-0.7.3-py3-none-any.whl#sha256=869326637eef5de7d4312b82ecf9ba85fcd9e038273d54c0bbe7602d3b8529ad"
                1.666044s   0ms TRACE uv_client::cached_client Sending fresh HEAD request for https://pypi.fury.io/oda/-/ver_Fz9wq/sdxp-0.7.3-py3-none-any.whl#sha256=869326637eef5de7d4312b82ecf9ba85fcd9e038273d54c0bbe7602d3b8529ad
                1.666269s   0ms DEBUG uv_auth::middleware Adding authentication to already-seen URL: https://pypi.fury.io/oda/-/ver_Fz9wq/sdxp-0.7.3-py3-none-any.whl#sha256=869326637eef5de7d4312b82ecf9ba85fcd9e038273d54c0bbe7602d3b8529ad
                1.895293s 229ms TRACE uv_client::httpcache cached request https://pypi.fury.io/oda/-/ver_Fz9wq/sdxp-0.7.3-py3-none-any.whl#sha256=869326637eef5de7d4312b82ecf9ba85fcd9e038273d54c0bbe7602d3b8529ad is storable because its response has a heuristically cacheable status code 200
           uv_client::cached_client::new_cache file=/private/var/folders/2k/by15c_cs40l9swcck47g6lpm0000gn/T/.tmplRVtUe/wheels-v0/index/5f5b51aad86993d4/sdxp/sdxp-0.7.3-py3-none-any.msgpack
           uv_client::registry_client::read_metadata_range_request wheel=sdxp-0.7.3-py3-none-any.whl
                1.896366s   0ms TRACE uv_client::registry_client Getting metadata for sdxp-0.7.3-py3-none-any.whl by range request
    1.897265s DEBUG uv_auth::middleware No credentials found for: https://s3.amazonaws.com/gemfury/gems/<redacted-path>/sdxp_0_7_3_py3_none_any_whl?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=<redacted-cred>%2F20240404%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20240404T181433Z&X-Amz-Expires=900&X-Amz-SignedHeaders=host&X-Amz-Signature=<redacted-sig>
error: Failed to download: sdxp==0.7.3
  Caused by: Failed to unzip wheel: sdxp-0.7.3-py3-none-any.whl
  Caused by: an upstream reader returned an error: io error occurred: Request error: HTTP status client error (403 Forbidden) for url (https://s3.amazonaws.com/gemfury/gems/<redacted-path>/sdxp_0_7_3_py3_none_any_whl?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=<redacted-cred>%2F20240404%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20240404T181433Z&X-Amz-Expires=900&X-Amz-SignedHeaders=host&X-Amz-Signature=<redacted-sig>)
  Caused by: io error occurred: Request error: HTTP status client error (403 Forbidden) for url (<the-same-url-again>)
  Caused by: Request error: HTTP status client error (403 Forbidden) for url (<the-same-url-again>)
  Caused by: HTTP status client error (403 Forbidden) for url (<the-same-url-again>)

If I isolate just the url in this output, there's another possible clue:

https://s3.amazonaws.com/gemfury/gems/<redacted-path>/sdxp_0_7_3_py3_none_any_whl?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=<redacted-cred>%2F20240404%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20240404T181433Z&X-Amz-Expires=900&X-Amz-SignedHeaders=host&X-Amz-Signature=<redacted-sig> (<-- this is the uv one)

compared with the curl one from above:

https://s3.amazonaws.com/gemfury/gems/<redacted-path>/sdxp_0_7_3_py3_none_any_whl?x-acct=<redacted-acct>&X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=<redacted-cred>%2F20240404%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20240404T180859Z&X-Amz-Expires=900&X-Amz-SignedHeaders=host&X-Amz-Signature=<redacted-sig> (<-- this is the curl one)

The curl URL has an x-acct=<stuff> query param, while the uv one doesn't. I have no idea why...
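
A comparison like this is easier to repeat if the query parameter names of the two redirect targets are diffed programmatically. The following is a minimal sketch using only the Rust standard library. `query_keys` and `missing_keys` are hypothetical helper names (not part of uv), and the parsing is deliberately naive: it does no percent-decoding and compares parameter names only, not values.

```rust
use std::collections::BTreeSet;

/// Collects the query parameter names of `url` (the part after the first `?`,
/// up to an optional `#fragment`). Hypothetical helper for illustration only.
fn query_keys(url: &str) -> BTreeSet<&str> {
    let query = url.split_once('?').map(|(_, q)| q).unwrap_or("");
    // Drop a trailing fragment such as `#sha256=...` if one is present.
    let query = query.split('#').next().unwrap_or("");
    query
        .split('&')
        .filter(|pair| !pair.is_empty())
        .map(|pair| pair.split('=').next().unwrap_or(pair))
        .collect()
}

/// Returns the parameter names that appear in `a` but not in `b`, sorted.
fn missing_keys<'a>(a: &'a str, b: &str) -> Vec<&'a str> {
    let b_keys = query_keys(b);
    query_keys(a)
        .into_iter()
        .filter(|key| !b_keys.contains(*key))
        .collect()
}
```

Calling `missing_keys(curl_url, uv_url)` on the two redirect targets lists the names that only the curl URL carries; swapping the arguments checks the other direction.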

torarvid avatar Apr 04 '24 18:04 torarvid

Wow thanks for the sleuthing. I have no idea why that would be dropped either.

zanieb avatar Apr 04 '24 18:04 zanieb

Do you know if you did anything special to get Gemfury to run against S3? My Gemfury URLs don't look like that so I've had trouble reproducing.

charliermarsh avatar Apr 04 '24 18:04 charliermarsh

I pushed a branch with an extra log if you want to give it a try: https://github.com/astral-sh/uv/pull/2823

I'm trying to narrow down when that part is dropped from the URL.

zanieb avatar Apr 04 '24 18:04 zanieb

We are experiencing the same issue. I wonder if it's because we have an older Gemfury account. This option is enabled under our organization settings:

[Screenshot: Screen Shot 2024-04-04 at 14 43 23]

I don't remember opting into that explicitly. Is that disabled in your account?

EDIT: Disabling this option changed the package source to a https://gemfury.s3-accelerate.dualstack.amazonaws.com/gems/... CDN URL instead of S3, but I still get the same error.

benwebber avatar Apr 04 '24 18:04 benwebber

I can try enabling that and then uploading a new package.

charliermarsh avatar Apr 04 '24 18:04 charliermarsh

Sadly it's still giving me URLs like https://pypi.fury.io/charliermarsh/-/ver_mt7Ge/gemfury-test-0.0.1.tar.gz.

charliermarsh avatar Apr 04 '24 18:04 charliermarsh

I don't know why, but I am sure that in our Gemfury account I have at least one package that works fine (it does not redirect to S3) and at least this one that fails (because it does redirect to S3).

I'm not sure what causes this difference in behavior. The one that fails for me was created by running poetry publish -r oda -u <secret> -p NOPASS, so it's possible poetry makes it "do S3 magic"? (The package that works fine is made by another team. At this point I have no idea how it was created/published)

torarvid avatar Apr 04 '24 18:04 torarvid

I emailed Gemfury.

charliermarsh avatar Apr 04 '24 19:04 charliermarsh

Perhaps I can get set up with one of these S3-backed indexes, or they can tell me what I'm doing wrong.

charliermarsh avatar Apr 04 '24 19:04 charliermarsh

Ok, I've said "I think I found a clue!" before and been wrong, but I persist: I think I might have found a clue! 😆

I made this test program:

use std::error::Error;

use reqwest::redirect::Policy;
use reqwest::Client;

#[tokio::main]
pub async fn main() -> Result<(), Box<dyn Error>> {
    let url = "https://pypi.fury.io/oda/-/ver_Fz9wq/sdxp-0.7.3-py3-none-any.whl";
    let url: reqwest::Url = url.parse().unwrap();
    let client = Client::builder()
        // Ensure that we *don't* follow redirects for this example
        .redirect(Policy::none())
        // fake being curl in case it matters (i don't think so)
        .user_agent("curl/8.4.0")
        .build()?;
    let req = client
        .head(url)
        .header("authorization", "Basic <redacted>")
        .header("accept", "*/*")
        .build()?;
    println!("111 {:?}", req);
    let head_response = client.execute(req).await?;
    println!("222 {:?}", head_response);
    let location = head_response.headers().get("location").unwrap();
    println!("333 {:?}", location);
    Ok(())
}

And I get this output (with redactions):

111 Request { method: HEAD, url: Url { scheme: "https", cannot_be_a_base: false, username: "", password: None, host: Some(Domain("pypi.fury.io")), port: None, path: "/oda/-/ver_Fz9wq/sdxp-0.7.3-py3-none-any.whl", query: None, fragment: None }, headers: {"authorization": "Basic <redacted>", "accept": "*/*"} }
222 Response { url: Url { scheme: "https", cannot_be_a_base: false, username: "", password: None, host: Some(Domain("pypi.fury.io")), port: None, path: "/oda/-/ver_Fz9wq/sdxp-0.7.3-py3-none-any.whl", query: None, fragment: None }, status: 302, headers: {"server": "Cowboy", "report-to": "{\"group\":\"heroku-nel\",\"max_age\":3600,\"endpoints\":[{\"url\":\"https://nel.heroku.com/reports?ts=1712306018&sid=929419e7-33ea-4e2f-85f0-7d8b7cd5cbd6&s=KrA%2BOTdymL0CqPAad5loOyUNMC4BcBEE%2BWCTsxIYvkA%3D\"}]}", "reporting-endpoints": "heroku-nel=https://nel.heroku.com/reports?ts=1712306018&sid=929419e7-33ea-4e2f-85f0-7d8b7cd5cbd6&s=KrA%2BOTdymL0CqPAad5loOyUNMC4BcBEE%2BWCTsxIYvkA%3D", "nel": "{\"report_to\":\"heroku-nel\",\"max_age\":3600,\"success_fraction\":0.005,\"failure_fraction\":0.05,\"response_headers\":[\"Via\"]}", "connection": "keep-alive", "content-type": "text/html; charset=utf-8", "location": "https://s3.amazonaws.com/gemfury/gems/<redacted-path>/sdxp_0_7_3_py3_none_any_whl?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=<redacted-cred>%2F20240405%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20240405T083338Z&X-Amz-Expires=900&X-Amz-SignedHeaders=host&X-Amz-Signature=<redacted-sig>", "vary": "Accept-Encoding", "date": "Fri, 05 Apr 2024 08:33:38 GMT", "via": "1.1 vegur"} }
333 "https://s3.amazonaws.com/gemfury/gems/<redacted-path>/sdxp_0_7_3_py3_none_any_whl?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=<redacted-cred>%2F20240405%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20240405T083338Z&X-Amz-Expires=900&X-Amz-SignedHeaders=host&X-Amz-Signature=<redacted-sig>"

So: if I curl -i <the-url-from-the-333-line> I again get the error from AWS: The request signature we calculated does not match the signature you provided. Check your key and signing method.

But if I change the curl-line to curl -i --head <the-url-from-the-333-line> it works.

Next, I changed my Rust test program from .head(url) to .get(url) and took the URL that it spits out. Now that URL works with curl -i and fails with curl -i --head.

So finally my clue becomes: The HTTP method is encoded in the signature for the S3 urls, so when you pass that S3 url to AsyncHttpRangeReader::from_head_response it won't work. The S3 url would only work with a HEAD request, but AsyncHttpRangeReader::from_head_response uses GET internally. 🤯

Am I right? I need all of your 🧠s to sanity check my logic 😛
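
A rough sketch of why the method would matter, assuming the SigV4 query-string signing that the `X-Amz-Algorithm=AWS4-HMAC-SHA256` parameter suggests: the signature is computed over a canonical request whose first line is the HTTP method. `canonical_request` below is a hypothetical helper, not uv or AWS SDK code. It expects an already-canonicalized query string (sorted, encoded, without `X-Amz-Signature`) and leaves out the hashing and HMAC steps, so it only shows that the text being signed differs between HEAD and GET.

```rust
/// Builds the text of a SigV4 canonical request for a presigned URL that
/// signs only the `host` header (`X-Amz-SignedHeaders=host`).
/// Hypothetical helper: `query` is assumed to be canonicalized by the caller.
fn canonical_request(method: &str, path: &str, query: &str, host: &str) -> String {
    format!(
        // Layout: method, URI, query, canonical headers (each ending in a
        // newline), blank separator, signed header names, payload hash.
        "{}\n{}\n{}\nhost:{}\n\nhost\nUNSIGNED-PAYLOAD",
        method, path, query, host
    )
}

/// Returns true when two canonical requests differ only in their first line,
/// i.e. only in the HTTP method.
fn differ_only_in_method(a: &str, b: &str) -> bool {
    let (a_method, a_rest) = a.split_once('\n').unwrap_or((a, ""));
    let (b_method, b_rest) = b.split_once('\n').unwrap_or((b, ""));
    a_method != b_method && a_rest == b_rest
}
```

Because the signed text changes with the method, a URL signed for HEAD would not be expected to validate for GET, which matches the curl results above.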

torarvid avatar Apr 05 '24 09:04 torarvid

Other people have had this issue: https://stackoverflow.com/questions/15717230/pre-signing-amazon-s3-urls-for-both-head-and-get-verbs

torarvid avatar Apr 05 '24 09:04 torarvid

I made a proof-of-concept PR to work around the issue. I do that by passing a modified response to the range reader so that it uses the "original" (gemfury) link and not the 302-redirected S3 link. Works for my local repro test case 😄

torarvid avatar Apr 05 '24 22:04 torarvid

I have the same issue with pypicloud, and @torarvid's PR did not work.

❯ cargo install --git https://github.com/torarvid/uv.git --rev 6d89c85 uv
❯ uv pip compile pyproject.toml -o requirements.txt --index-url http://internal-pypi:8080/simple/
error: Failed to download: internal-pkg==0.1.1054928
  Caused by: HTTP status client error (403 Forbidden) for url (https://bucket-name.s3.amazonaws.com/pypi10c6/internal-pkg/internal-pkg-0.1.1054928-py3-none-any.whl
    ?Signature=%2FhTEgX6psBoSyuCM9F4BiwpCEbw%3D
    &Expires=1870939460
    &AWSAccessKeyId=FEWNCEFY
    &x-amz-security-token=TP//////////ARNU/OkGQV/8AwxoYm)

If I manually curl the printed URL, GET works but HEAD gets 403.

curl -vvv https://bucket-name.s3..  # 200
curl -vvv -X HEAD https://bucket-name.s3..  # 403 Forbidden

elbaro avatar Apr 16 '24 09:04 elbaro

As a workaround for pypicloud, I added 403 Forbidden to https://github.com/astral-sh/uv/pull/2186/files and it worked.
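
The idea behind this kind of workaround can be sketched as a small decision function: if the HEAD probe comes back with a status that is treated as "HEAD not usable here", skip range requests and download the file with a plain GET. This is an illustrative sketch only. `Strategy` and `choose_strategy` are hypothetical names, and the set of fallback statuses is passed in by the caller because the exact list used in the linked PR is not shown in this thread.

```rust
/// What to do after probing a wheel URL with HEAD. Hypothetical type.
#[derive(Debug, PartialEq)]
enum Strategy {
    /// HEAD succeeded, so metadata can be read via range requests.
    RangeRequests,
    /// HEAD was rejected with a status we tolerate; fetch the file with GET.
    FullDownload,
    /// HEAD failed with a status we do not tolerate; report it.
    Fail(u16),
}

/// Picks a download strategy from the HEAD status code.
/// `fallback_statuses` is an assumption of this sketch, e.g. `&[403]`.
fn choose_strategy(head_status: u16, fallback_statuses: &[u16]) -> Strategy {
    if (200..300).contains(&head_status) {
        Strategy::RangeRequests
    } else if fallback_statuses.contains(&head_status) {
        Strategy::FullDownload
    } else {
        Strategy::Fail(head_status)
    }
}
```

With 403 in the fallback list, the pypicloud case above (GET works, HEAD gets 403) would take the `FullDownload` path instead of failing.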

elbaro avatar Apr 16 '24 17:04 elbaro

Interesting, ok, we can add that. Do you want to submit a PR?

charliermarsh avatar Apr 16 '24 17:04 charliermarsh

@charliermarsh I re-opened as I do not think that addresses all of the cases here.

zanieb avatar Apr 16 '24 18:04 zanieb

Thanks, sorry, I didn't mean to close this.

charliermarsh avatar Apr 16 '24 18:04 charliermarsh

If anyone is willing to test https://github.com/astral-sh/uv/pull/3460 I would appreciate it.

charliermarsh avatar May 08 '24 14:05 charliermarsh