`uv pip install` with Nexus gives "Missing `Content-Type`" error

Open kendallbailey opened this issue 1 year ago • 10 comments

uv version is 0.1.5 on Ubuntu 20.04, python 3.10

With UV_INDEX_URL pointing to a private Sonatype Nexus service acting as a pypi proxy:

$ uv pip install gcsfs
error: Missing `Content-Type` header for https://****:****@nexus.example.com/repository/pypi-group/simple/gcsfs/

Hitting the above URL with curl -v shows a content type header of "text/html; charset=UTF-8" is returned

$ curl -o x -v https://******:******@nexus.example.com/repository/pypi-group/simple/gcsfs/
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0*   Trying x.x.x.x:443...
* TCP_NODELAY set
* Connected to nexus.example.com (x.x.x.x) port 443 (#0)
<snip>
* Using HTTP2, server supports multi-use
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
} [5 bytes data]
* Server auth using Basic with user '*****'
* Using Stream ID: 1 (easy handle 0x55a8e6d03680)
} [5 bytes data]
> GET /repository/pypi-group/simple/gcsfs/ HTTP/2
> Host: nexus.example.com
> authorization: Basic ****************** 
> user-agent: curl/7.68.0
> accept: */*
> 
<snip>
< HTTP/2 200 
< content-type: text/html; charset=UTF-8
< content-length: 24690
< date: Tue, 20 Feb 2024 14:11:26 GMT
< server: Nexus/3.48.0-01 (OSS)
< x-content-type-options: nosniff
< content-security-policy: sandbox allow-forms allow-modals allow-popups allow-presentation allow-scripts allow-top-navigation
< x-xss-protection: 1; mode=block
< last-modified: Tue, 20 Feb 2024 13:59:16 GMT
< 
{ [16048 bytes data]
100 24690  100 24690    0     0  84554      0 --:--:-- --:--:-- --:--:-- 84845
* Connection #0 to host nexus.example.com left intact

Last week I tested uv v0.1.2 in the same environment and it rejected md5 hashes, but pandas was one package that had no issue. Now with v0.1.5 pandas and many others have the Content-Type issue but there are still some packages that will install via Nexus, such as aiohttp. I can't detect any difference in the HTTP responses from Nexus between the packages that work and those that don't. All have Content-Type of "text/html; charset=UTF-8". Comparing against http://pypi.org/simple/<package>/, it appears pypi sets the Content-Type header to "text/html" (no charset specified).
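For comparing the two header values above: the `charset` parameter does not change the media type itself, so `text/html; charset=UTF-8` and `text/html` name the same type. A minimal sketch, assuming a hypothetical helper `media_type` (an illustration only, not uv's parsing code):

```python
def media_type(content_type):
    """Return the bare media type, ignoring parameters such as charset.

    `content_type` is the raw header value, or None when the header is absent.
    """
    if content_type is None:
        # A missing header is a different case from an unexpected value.
        return None
    return content_type.split(";", 1)[0].strip().lower()


nexus_value = media_type("text/html; charset=UTF-8")
pypi_value = media_type("text/html")
print(nexus_value == pypi_value)
```

Under this reading, the Nexus and PyPI responses differ only in a parameter, which suggests the charset difference alone is unlikely to explain the error.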

kendallbailey avatar Feb 20 '24 14:02 kendallbailey

Thanks, will take a look! That error is thrown when no header is present at all -- it's not from parsing the content type -- so not immediately sure what's up on first glance.

charliermarsh avatar Feb 20 '24 14:02 charliermarsh

Update: I tried uv -v pip install gcsfs and it talked about revalidating the cache before again hitting the Content-Type error. So I deleted the cache entirely, rm -rf ~/.cache/uv. Then it was able to install gcsfs without error. It now works to install my application package with 251 dependencies. Seems like uv v0.1.5 was having a problem caused by the cache created by v0.1.2.

Since deleting the cache I haven't been able to reproduce the Content-Type error.

kendallbailey avatar Feb 20 '24 22:02 kendallbailey

output from uv -v

$ uv -v pip install nvidia-curand-cu12
 uv::requirements::from_source source=nvidia-curand-cu12
    0.001079s DEBUG uv_interpreter::virtual_env Found a virtualenv through VIRTUAL_ENV at: /home/kbailey/venvs/py310/dmnet-test
    0.001240s DEBUG uv_interpreter::interpreter Using cached markers for: /home/kbailey/venvs/py310/dmnet-test/bin/python
    0.001254s DEBUG uv::commands::pip_install Using Python 3.10.13 environment at /home/kbailey/venvs/py310/dmnet-test/bin/python
 uv_client::flat_index::from_entries 
 uv_resolver::resolver::solve 
      0.005638s   0ms DEBUG uv_resolver::resolver Solving with target Python version 3.10.13
   uv_resolver::resolver::choose_version package=root
   uv_resolver::resolver::get_dependencies package=root, version=0a0.dev0
        0.005729s   0ms DEBUG uv_resolver::resolver Adding direct dependency: nvidia-curand-cu12*
   uv_resolver::resolver::choose_version package=nvidia-curand-cu12
     uv_resolver::resolver::package_wait package_name=nvidia-curand-cu12
 uv_resolver::resolver::process_request request=Versions nvidia-curand-cu12
   uv_client::registry_client::simple_api package=nvidia-curand-cu12
     uv_client::cached_client::get_cacheable 
       uv_client::cached_client::read_and_parse_cache file=/home/kbailey/.cache/uv/simple-v1/ede79a3b4249d4c4/nvidia-curand-cu12.rkyv
 uv_resolver::resolver::process_request request=Prefetch nvidia-curand-cu12 *
          0.006496s   0ms DEBUG uv_client::cached_client Found stale response for: https://nexus.example.com/repository/pypi-group/simple/nvidia-curand-cu12/
          0.006518s   0ms DEBUG uv_client::cached_client Sending revalidation request for: https://nexus.example.com/repository/pypi-group/simple/nvidia-curand-cu12/
       uv_client::cached_client::revalidation_request url="https://nexus.example.com/repository/pypi-group/simple/nvidia-curand-cu12/"
          0.179262s 173ms DEBUG uv_client::cached_client Found modified response for: https://nexus.example.com/repository/pypi-group/simple/nvidia-curand-cu12/
       uv_client::cached_client::new_cache file=/home/kbailey/.cache/uv/simple-v1/ede79a3b4249d4c4/nvidia-curand-cu12.rkyv
       uv_client::registry_client::parse_simple_api package=nvidia-curand-cu12
error: Missing `Content-Type` header for https://****:****@nexus.example.com/repository/pypi-group/simple/nvidia-curand-cu12/

kendallbailey avatar Feb 20 '24 23:02 kendallbailey

@kendallbailey -- I deleted your first comment as I believe it unintentionally included sensitive credentials.

charliermarsh avatar Feb 20 '24 23:02 charliermarsh

Testing with uv 0.1.7 shows new behavior. With 0.1.5, as long as the cache is deleted, uv pip install works in all my tests. With uv 0.1.7, it fails even with an empty cache, but with an HTTP 401 error rather than the missing Content-Type error.

With uv 0.1.7:

$ /bin/rm -rf ~/.cache/uv && uv pip install project-meta==1.0
error: Failed to download: project-meta==1.0
  Caused by: HTTP status client error (401 Unauthorized) for url (https://nexus.example.com/repository/pypi-group/packages/project-meta/1.0/project_meta-1.0-py3-none-any.whl#sha256=7eca5b23b722a6c5f912e8061d9bf15179be36ea5abf8c0d51ae2724e789dc88)

Same thing with uv 0.1.5:

$ /bin/rm -rf ~/.cache/uv && uv pip install project-meta==1.0
Resolved 48 packages in 59.61s
Downloaded 1 package in 38ms
Installed 1 package in 1ms
 + project-meta==1.0

UV_INDEX_URL was set the same in both cases.

kendallbailey avatar Feb 22 '24 15:02 kendallbailey

Hello, I have the same exact behavior with a Nexus pypi proxy in my company.

Djailla avatar Feb 25 '24 20:02 Djailla

It works using --no-cache, but then uv loses a lot of its value :(

Djailla avatar Feb 25 '24 20:02 Djailla

You can close this issue, thanks @charliermarsh

Djailla avatar Feb 26 '24 22:02 Djailla

I still see the issue arise in uv 0.1.11

kendallbailey avatar Feb 26 '24 22:02 kendallbailey

Something very specific to Nexus so we’ll have to set up a repo to have any chance of debugging it.

charliermarsh avatar Feb 26 '24 23:02 charliermarsh

I set up a Nexus repo myself but unfortunately everything is working fine locally.

charliermarsh avatar Feb 28 '24 05:02 charliermarsh

In case it helps: I have the same issue with a Nexus repo on the latest uv version. --no-cache fixes the issue (but is slower). curl shows that a Content-Type header is provided.

mbrulatout avatar Feb 28 '24 18:02 mbrulatout

I'm starting to suspect uv is dropping the credentials from the index URL when doing a "revalidation_request". All of the URLs being logged lack the credentials but the error message includes them. If I curl the URL in the verbose output the response indeed has no content-type header, but also has a 401 response code. If I curl the URL in the error message, then the response has a 200 response code and includes a content-type header. @charliermarsh, when setting up Nexus did you configure it to require authentication?

tail end of uv verbose output:

 uv_client::cached_client::from_path_sync path="/home/kbailey/.cache/uv/simple-v3/ede79a3b4249d4c4/setuptools.rkyv"
              0.044396s   1ms DEBUG uv_client::cached_client Found stale response for: https://nexus.example.com/repository/pypi-group/simple/setuptools/
              0.044423s   1ms DEBUG uv_client::cached_client Sending revalidation request for: https://nexus.example.com/repository/pypi-group/simple/setuptools/
           uv_client::cached_client::revalidation_request url="https://nexus.example.com/repository/pypi-group/simple/setuptools/"
              0.538425s 495ms DEBUG uv_client::cached_client Found modified response for: https://nexus.example.com/repository/pypi-group/simple/setuptools/
           uv_client::cached_client::new_cache file=/home/kbailey/.cache/uv/simple-v3/ede79a3b4249d4c4/setuptools.rkyv
           uv_client::registry_client::parse_simple_api package=setuptools
error: Failed to build editables
  Caused by: Failed to build editable: file:///home/kbailey/work/example-uv
  Caused by: Failed to install requirements from build-system.requires (resolve)
  Caused by: No solution found when resolving: setuptools >=61.0
  Caused by: Missing `Content-Type` header for https://nexus_user:nexus_password@nexus.example.com/repository/pypi-group/simple/setuptools/

I replaced the real credentials with nexus_user:nexus_password above

I've seen 401 errors in other contexts. By the way, I managed to use rust-gdb to set a breakpoint where the Content-Type error is triggered but didn't have success debugging beyond that.

kendallbailey avatar Mar 04 '24 15:03 kendallbailey

This could be right, will take a look at this hypothesis, although I'm surprised that it hasn't affected other kinds of indexes. Thanks @kendallbailey.

charliermarsh avatar Mar 04 '24 15:03 charliermarsh

I don't think we could be receiving a 401, because we call error_for_status on all of these routes. IIUC, typically the authentication gets moved from the URL into the headers, which is why it's not present in the verbose output.
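To illustrate what "moved from the URL into the headers" means, here is a minimal sketch using only the Python standard library. The function name `move_credentials_to_header` is hypothetical and this is not uv's implementation; it only shows why a logged URL can lack credentials while the request still carries them as a Basic `Authorization` header.

```python
import base64
from urllib.parse import unquote, urlsplit, urlunsplit


def move_credentials_to_header(url):
    """Split `user:password@` off a URL and return (clean_url, headers)."""
    parts = urlsplit(url)
    if parts.username is None:
        return url, {}
    # Keep everything after the last "@" in the authority (host and optional port).
    host = parts.netloc.rpartition("@")[2]
    clean_url = urlunsplit(
        (parts.scheme, host, parts.path, parts.query, parts.fragment)
    )
    # Userinfo may be percent-encoded in the URL; decode before encoding as Basic.
    userinfo = f"{unquote(parts.username)}:{unquote(parts.password or '')}"
    token = base64.b64encode(userinfo.encode("utf-8")).decode("ascii")
    return clean_url, {"Authorization": f"Basic {token}"}


clean_url, headers = move_credentials_to_header(
    "https://nexus_user:nexus_password@nexus.example.com/repository/pypi-group/simple/setuptools/"
)
print(clean_url)
```

Running curl against the clean URL without the header would omit the credentials entirely, which is consistent with the 401 seen when copying the URL from the verbose output.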

charliermarsh avatar Mar 04 '24 15:03 charliermarsh

Setting a breakpoint here I can see that the HTTP response that triggers the Content-Type error has a 304 status.

(gdb) p response.res.head.status
$3 = http::status::StatusCode (core::num::nonzero::NonZeroU16 (304))

kendallbailey avatar Mar 04 '24 20:03 kendallbailey

Any idea what Sonatype Nexus version you're on?

charliermarsh avatar Mar 05 '24 19:03 charliermarsh

I'm able to sort of reproduce it by forcing some error paths manually.

charliermarsh avatar Mar 05 '24 20:03 charliermarsh

If I put up a branch with some additional logic and logging, would anyone here be able to test against their Nexus repos?

charliermarsh avatar Mar 05 '24 20:03 charliermarsh

Anyway, draft PR is here: https://github.com/astral-sh/uv/pull/2218

charliermarsh avatar Mar 05 '24 20:03 charliermarsh

If someone can run against that branch with RUST_LOG=trace cargo run pip install --verbose ... that would be much appreciated.

charliermarsh avatar Mar 05 '24 21:03 charliermarsh

Thank you for working on this issue. The Nexus version I'm using is OSS 3.48.0-01. I ran the PR branch on a project with many dependencies and it didn't fail. The trace output is huge. I did some other experiments, all succeeded where the stable uv hit the Content-Type error. Is there something you'd like me to extract from the trace output?

kendallbailey avatar Mar 06 '24 19:03 kendallbailey

Thanks! I'm mostly interested in lines that start with:

  • new_policy:
  • self.response:
  • is modified because status
  • not modified because

If you could send me one grouping of those, it would help us understand why we're not respecting the server's 304. (Although I think the fix I have in there is still correct; it'll just be slower for Nexus, and I'd like to get caching working.)

charliermarsh avatar Mar 06 '24 19:03 charliermarsh

I didn't find the last two strings. Here's one block with the first two, from installing pandas.

          0.634125s 565ms DEBUG uv_client::httpcache checking if cached response is modified
          0.634190s 565ms DEBUG uv_client::httpcache new_policy: CachePolicy { config: CacheConfig { shared: false, heuristic_percent: 10 }, request: Request { uri: "https://nexus.example.com/repository/pypi-group/simple/pandas/", method: Get, headers: RequestHeaders { cc: CacheControl { max_age_seconds: None, no_cache: false, no_store: false, no_transform: false, max_stale_seconds: None, min_fresh_seconds: None, only_if_cached: false, must_revalidate: false, must_understand: false, private: false, proxy_revalidate: false, public: false, s_maxage_seconds: None, immutable: false }, authorization: true }, unix_timestamp: 1709753888 }, response: Response { status: 304, headers: ResponseHeaders { cc: CacheControl { max_age_seconds: None, no_cache: false, no_store: false, no_transform: false, max_stale_seconds: None, min_fresh_seconds: None, only_if_cached: false, must_revalidate: false, must_understand: false, private: false, proxy_revalidate: false, public: false, s_maxage_seconds: None, immutable: false }, age_seconds: None, date_unix_timestamp: Some(1709753888), expires_unix_timestamp: None, last_modified_unix_timestamp: None, etag: None }, unix_timestamp: 1709753888 }, vary: Vary { fields: [] } }
          0.634281s 565ms DEBUG uv_client::httpcache self.response: ArchivedResponse { status: 200, headers: ArchivedResponseHeaders { cc: ArchivedCacheControl { max_age_seconds: None, no_cache: false, no_store: false, no_transform: false, max_stale_seconds: None, min_fresh_seconds: None, only_if_cached: false, must_revalidate: false, must_understand: false, private: false, proxy_revalidate: false, public: false, s_maxage_seconds: None, immutable: false }, age_seconds: None, date_unix_timestamp: Some(1709675539), expires_unix_timestamp: None, last_modified_unix_timestamp: Some(1709511660), etag: None }, unix_timestamp: 1709675540 }
          0.634343s 565ms DEBUG uv_client::cached_client Found modified response for: https://nexus.example.com/repository/pypi-group/simple/pandas/
          0.634460s 565ms TRACE uv_client::httpcache cached request https://nexus.example.com/repository/pypi-group/simple/pandas/ is not storable because its response has unsupported status code 304
          0.634555s 565ms DEBUG uv_client::cached_client Server returned invalid 304 for: https://nexus.example.com/repository/pypi-group/simple/pandas/
       uv_client::cached_client::fresh_request url="https://nexus.example.com/repository/pypi-group/simple/pandas/"
            0.635080s   0ms TRACE uv_client::cached_client Sending fresh GET request for https://nexus.example.com/repository/pypi-group/simple/pandas/
            0.635290s   0ms TRACE hyper::client::pool take? ("https", nexus.example.com): expiration = Some(90s)
            0.635346s   0ms DEBUG hyper::client::pool reuse idle connection for ("https", nexus.example.com)

kendallbailey avatar Mar 06 '24 20:03 kendallbailey

That's great, thanks! We'll look into it. I appreciate your help.

charliermarsh avatar Mar 06 '24 20:03 charliermarsh

Awesome, will test it as soon as it is released!

Djailla avatar Mar 06 '24 20:03 Djailla

Okay, so it looks like we're failing to use the cache because:

  • new_policy.response.headers.last_modified_unix_timestamp is None...
  • But self.response.headers.last_modified_unix_timestamp is Some(1709511660).

So the server isn't returning a last_modified_unix_timestamp. My read is that technically we're not supposed to use the cached value here, based on https://www.rfc-editor.org/rfc/rfc9111.html#section-4.3.4?
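One way to read that section of RFC 9111, reduced to the case of a single stored response, is sketched below. This is a simplification under stated assumptions: the function `may_freshen` and the dict layout are hypothetical, strong and weak validators are not distinguished, and it is not uv's cache policy code.

```python
def may_freshen(stored, not_modified):
    """Decide whether a 304 response may refresh the one stored response.

    Both arguments are dicts with optional "etag" and "last_modified" keys.
    """
    validators = {
        key: not_modified.get(key)
        for key in ("etag", "last_modified")
        if not_modified.get(key) is not None
    }
    if validators:
        # The 304 names a representation; the stored response must share a validator.
        return any(stored.get(key) == value for key, value in validators.items())
    # The 304 carries no validator: only acceptable if the stored one has none either.
    return stored.get("etag") is None and stored.get("last_modified") is None


# Values taken from the trace above: the cached response has a Last-Modified
# timestamp, while the 304 from Nexus carries neither Last-Modified nor ETag.
stored = {"etag": None, "last_modified": 1709511660}
revalidation = {"etag": None, "last_modified": None}
print(may_freshen(stored, revalidation))
```

With those values the check comes out negative, which matches the "Found modified response" line in the log even though the status was 304.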

cc @BurntSushi

charliermarsh avatar Mar 06 '24 21:03 charliermarsh

@BurntSushi - Is it possible there's something we're not sending up with our request, that's causing the server not to send its last-modified timestamp?

charliermarsh avatar Mar 06 '24 21:03 charliermarsh

I merged the fallback behavior, which should allow using Nexus with only a slight performance hit. (This only affects fetching metadata, which isn't that expensive anyway.) I'll open a separate issue to understand the root cause for Nexus (i.e., why it's not returning a Last-Modified).
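Going by the trace lines earlier in the thread ("Server returned invalid 304" followed by "Sending fresh GET request"), the fallback can be pictured roughly as follows. This is a sketch only: every name here (`fetch_simple_page`, `revalidate`, `fetch_fresh`, `cache_accepts_304`) is hypothetical, and the real change lives in uv's Rust client.

```python
def fetch_simple_page(cached_body, revalidate, fetch_fresh, cache_accepts_304):
    """Return the index page body, falling back to a fresh GET on an unusable 304.

    revalidate()        -> dict with "status" and "body" (conditional request)
    fetch_fresh()       -> body from an unconditional GET
    cache_accepts_304(r) -> True if the cache policy lets the 304 reuse the cache
    """
    if cached_body is None:
        return fetch_fresh()
    response = revalidate()
    if response["status"] == 304:
        if cache_accepts_304(response):
            return cached_body
        # A 304 has no body and no Content-Type, so it cannot be parsed as an
        # index page; issue a fresh, unconditional request instead of failing.
        return fetch_fresh()
    return response["body"]


calls = []
body = fetch_simple_page(
    cached_body="<html>stale</html>",
    revalidate=lambda: {"status": 304, "body": None},
    fetch_fresh=lambda: calls.append("fresh") or "<html>fresh</html>",
    cache_accepts_304=lambda response: False,
)
print(body, calls)
```

The cost is one extra request per affected page, which is the "slight performance hit" mentioned above.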

charliermarsh avatar Mar 06 '24 23:03 charliermarsh