
pixi hangs resolving pypi-dependencies

Open dhirschfeld opened this issue 9 months ago • 10 comments

Checks

  • [x] I have checked that this issue has not already been reported.

  • [x] I have confirmed this bug exists on the latest version of pixi, using pixi --version.

Reproducible example

I haven't (yet) created a minimal reproducer.

Issue description

I have a large (analytics) environment made up of conda packages. This resolves very quickly. As soon as I add a pypi-dependencies section pixi hangs when trying to resolve the environment.

It appears to get stuck at the "mapping conda to pypi packages" step and never progresses. I'm currently over 10 minutes in with 0/553 resolved:

❯ pixi shell
⠂ updating lock-file   [00:10:48] [━━━━━━━━━━╾─────────────────────────────]    2/8                                                                                                          
  ⠂ py310:linux-64       [00:10:49] [────────────────────]    0/553  mapping conda to pypi packages
  ⠒ py312:linux-64       [00:10:48] [────────────────────]    0/553  mapping conda to pypi packages 

Expected behavior

The environment resolves and I enter a shell for that environment.

dhirschfeld avatar Feb 21 '25 08:02 dhirschfeld

Hey @dhirschfeld! Sorry to hear that you have run into this.

Are you running pixi on your local machine? Are these URLs accessible with curl?

https://raw.githubusercontent.com/prefix-dev/parselmouth/main/files/compressed_mapping.json
https://conda-mapping.prefix.dev/hash-v0/115b796fddc846bee6f47e3c57d04d12fa93a47a7a8ef639cefdc05203c1bf00

nichmor avatar Feb 21 '25 08:02 nichmor

It's a new Windows (WSL2) AVD I'm trying to set up. It's pretty locked down so that's a good avenue of investigation...

dhirschfeld avatar Feb 21 '25 09:02 dhirschfeld

NB: I point pixi to our internal Cloudsmith repository for both PyPI and conda

❯ cat /etc/pixi/config.toml
default-channels = ["https://conda.cloudsmith.io/<redacted>"]

[pypi-config]
index-url ="https://dl.cloudsmith.io/<redacted>"

dhirschfeld avatar Feb 21 '25 09:02 dhirschfeld

Cloudsmith supports conda insofar as you can upload and download packages as well as repodata patches so I'm currently manually uploading packages by parsing my pixi.lock file for urls. I then also upload the repodata patch files from the conda-forge-repodata-patches.
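For anyone wanting to script the same step, here is a minimal sketch of pulling the conda package URLs out of a lock file. It assumes the lock file lists conda packages on lines of the form `- conda: <url>`; that line shape and the helper name `conda_urls_from_lock` are assumptions for illustration, not something pixi documents as a stable interface.

```python
import re

# Assumption: conda entries in the lock file look like "- conda: <url>".
CONDA_ENTRY = re.compile(r"^[ \t]*-[ \t]+conda:[ \t]+(\S+)[ \t]*$", re.MULTILINE)


def conda_urls_from_lock(lock_text: str) -> list[str]:
    """Return the unique conda package URLs in order of first appearance."""
    seen: dict[str, None] = {}
    for url in CONDA_ENTRY.findall(lock_text):
        # A dict keeps insertion order, so this de-duplicates without sorting.
        seen.setdefault(url, None)
    return list(seen)
```

Reading the file with `pathlib.Path("pixi.lock").read_text()` and passing the text to `conda_urls_from_lock` would give the list of URLs to mirror.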

It sounds like, for pixi to work, our Cloudsmith repository might also need to serve a mapping file?

dhirschfeld avatar Feb 21 '25 09:02 dhirschfeld

The json file downloaded fine, but not the other one:

❯ curl -fsSL -vvv https://conda-mapping.prefix.dev/hash-v0/115b796fddc846bee6f47e3c57d04d12fa93a47a7a8ef639cefdc05203c1bf00
* Host conda-mapping.prefix.dev:443 was resolved.
* IPv6: 2606:4700:20::681a:dbc, 2606:4700:20::ac43:4867, 2606:4700:20::681a:cbc
* IPv4: 104.26.12.188, 104.26.13.188, 172.67.72.103
*   Trying 104.26.12.188:443...
*   Trying [2606:4700:20::681a:dbc]:443...
* Immediate connect fail for 2606:4700:20::681a:dbc: Network is unreachable
*   Trying [2606:4700:20::ac43:4867]:443...
* Immediate connect fail for 2606:4700:20::ac43:4867: Network is unreachable
*   Trying [2606:4700:20::681a:cbc]:443...
* Immediate connect fail for 2606:4700:20::681a:cbc: Network is unreachable
* connect to 104.26.12.188 port 443 from 172.27.88.164 port 58726 failed: Connection timed out
*   Trying 104.26.13.188:443...
* ipv4 connect timeout after 82889ms, move on!
*   Trying 172.67.72.103:443...
* ipv4 connect timeout after 82888ms, move on!
* Failed to connect to conda-mapping.prefix.dev port 443 after 300213 ms: Timeout was reached
* Closing connection
curl: (28) Failed to connect to conda-mapping.prefix.dev port 443 after 300213 ms: Timeout was reached

dhirschfeld avatar Feb 21 '25 09:02 dhirschfeld

It's not impossible to get security to whitelist conda-mapping.prefix.dev but it would be greatly preferable to be able to serve the files from our Cloudsmith repository.

On a less locked-down network I can download the file:

❯ curl -fsSL https://conda-mapping.prefix.dev/hash-v0/115b796fddc846bee6f47e3c57d04d12fa93a47a7a8ef639cefdc05203c1bf00 | jq
{
  "pypi_normalized_names": [
    "requests"
  ],
  "versions": {
    "requests": "2.32.2"
  },
  "conda_name": "requests",
  "package_name": "requests-2.32.2-pyhd8ed1ab_0.conda",
  "direct_url": null
}

In CI I'm able to download packages from conda-forge and push them to our Cloudsmith repo so I assume I could do the same with these files, if I knew the names/urls to download from.

dhirschfeld avatar Feb 21 '25 09:02 dhirschfeld

Is there any way to download the mapping and point to a cached version? Or is it required to be connected to the internet to download from conda-mapping.prefix.dev?

dhirschfeld avatar Feb 22 '25 01:02 dhirschfeld

Is there any way to download the mapping and point to a cached version? Or is it required to be connected to the internet to download from conda-mapping.prefix.dev?

Yes, it is possible to use a local one (or maybe one hosted somewhere): https://github.com/conda/rattler/blob/main/crates/rattler_cache/src/package_cache/mod.rs

You could download one from here, based on your channel: https://github.com/prefix-dev/parselmouth/tree/main/files/v0. You could then set up an automation that pulls it every 10 minutes and self-host it somewhere.

It would also be possible to roll out parselmouth on your own infrastructure (by self-hosting it), but that would require some adjustments in parselmouth. Let me know if you are interested in it.

nichmor avatar Feb 24 '25 09:02 nichmor

I guess at least this line would need to be configurable? https://github.com/prefix-dev/pixi/blob/93a1d1cd830ba4049c52d8738766246fcd88ed3b/crates/pypi_mapping/src/prefix_pypi_name_mapping.rs#L22

I'd be happy downloading the compressed_mapping.json file periodically, I'm just not sure what my webserver should return when the user makes a GET request to /{HASH_DIR}/{hash_str} https://github.com/prefix-dev/pixi/blob/93a1d1cd830ba4049c52d8738766246fcd88ed3b/crates/pypi_mapping/src/prefix_pypi_name_mapping.rs#L42-L48
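To make the question concrete, here is a rough sketch of what such a mirror could look like. It assumes each hash file is stored locally under `<root>/hash-v0/<hash>` and contains a JSON document shaped like the `requests` record shown earlier in this thread. The directory layout, the names `lookup`, `MappingHandler` and `serve`, and the choice of a 404 for unknown hashes are all assumptions; what pixi actually expects for a missing hash is not confirmed here.

```python
import re
from http.server import BaseHTTPRequestHandler, HTTPServer
from pathlib import Path

# The hash in the URL above is a 64-character lowercase hex string.
HASH_PATH = re.compile(r"^/hash-v0/([0-9a-f]{64})$")


def lookup(root: Path, request_path: str) -> tuple[int, bytes]:
    """Map a request path to (status, body) using files mirrored under root."""
    match = HASH_PATH.match(request_path)
    if match is None:
        return 404, b""
    record = root / "hash-v0" / match.group(1)
    if not record.is_file():
        return 404, b""  # assumption: unknown hashes are answered with 404
    return 200, record.read_bytes()


class MappingHandler(BaseHTTPRequestHandler):
    root = Path("mirror")  # hypothetical local directory holding the files

    def do_GET(self) -> None:
        status, body = lookup(self.root, self.path)
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8000) -> None:
    """Serve the mirrored files on localhost until interrupted."""
    HTTPServer(("127.0.0.1", port), MappingHandler).serve_forever()
```

Even with a mirror like this, pixi would still need the base URL on the linked line to be configurable.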

dhirschfeld avatar Feb 25 '25 01:02 dhirschfeld

I think the most urgent todo is to shorten the wait time. In most cases, a poor or blocked network (e.g. behind the GFW in China) won't benefit from such a long wait. Frequent, short attempts make much more sense. I suggest timeout=5s and retries <= 3, or raising an error immediately without retrying. If it fails, let users know as soon as possible so they can retry themselves, like:

until pixi install; do
sleep 3
done
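The suggested policy (short timeout, a small bounded number of attempts, then a clear error) can be sketched as follows. This only illustrates the suggestion and is not how pixi behaves today; `fetch_with_retries` and its parameters are hypothetical names, with defaults taken from the numbers proposed above.

```python
import time
import urllib.request


def fetch_with_retries(url, timeout=5.0, retries=3, pause=1.0,
                       opener=urllib.request.urlopen):
    """Fetch url, failing fast: short timeout, bounded retries, clear error."""
    last_error = None
    for attempt in range(1, retries + 1):
        try:
            with opener(url, timeout=timeout) as response:
                return response.read()
        except OSError as exc:
            # URLError and socket timeouts are both OSError subclasses.
            last_error = exc
            if attempt < retries:
                time.sleep(pause)
    raise RuntimeError(
        f"giving up on {url} after {retries} attempts"
    ) from last_error
```

With these defaults a blocked host costs roughly 17 seconds (three 5-second timeouts plus two 1-second pauses) instead of the 300-second connect timeout seen in the curl output earlier in the thread.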

YuanfengZhang avatar Apr 19 '25 15:04 YuanfengZhang