
Issues when installing pyannote.audio with torchaudio

Open niemiaszek opened this issue 10 months ago • 7 comments

Checks

  • [X] I have checked that this issue has not already been reported.

  • [X] I have confirmed this bug exists on the latest version of pixi, using pixi --version.

Reproducible example


Issue description

First of all, PyPI packages with "." in the name cannot be declared in pixi.toml ("data did not match any variant of untagged enum PyPiRequirement"), but they can be added with pixi add, using "-" instead of ".".

The second issue is the requirements for pyannote.audio. I'm building a CPU-only env with it, and it works fine when using:

[dependencies]
pytorch = {version="*", channel="pytorch"}
torchvision = {version="*", channel="pytorch"}
torchaudio = {version="*", channel="pytorch"}

However, adding pyannote-audio ends up installing torchaudio additionally from pip, as it is specified in its requirements:

torchaudio                            2.2.2         py311_cpu              5.1 MiB    conda  torchaudio-2.2.2-py311_cpu.tar.bz2
torchaudio                            2.2.2                                12.2 MiB   pypi   torchaudio-2.2.2-cp311-cp311-manylinux1_x86_64.whl
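A duplicate like this can be spotted mechanically. The following is a minimal sketch, not a pixi feature: it assumes the listing has the column layout shown above (package name first, and a kind column reading conda or pypi), and the function name find_duplicates is made up for illustration.

```python
from collections import defaultdict

KINDS = {"conda", "pypi"}


def find_duplicates(listing: str) -> dict:
    """Return packages that appear under more than one kind (conda and pypi)."""
    kinds_by_name = defaultdict(set)
    for line in listing.splitlines():
        tokens = line.split()
        if len(tokens) < 2:
            continue  # skip blank or one-word lines
        name = tokens[0]
        # The build column is empty for pypi rows, so locate the kind by
        # value rather than by column position.
        kind = next((t for t in tokens[1:] if t in KINDS), None)
        if kind is not None:
            kinds_by_name[name].add(kind)
    return {n: k for n, k in kinds_by_name.items() if len(k) > 1}


listing = """\
torchaudio  2.2.2  py311_cpu  5.1 MiB   conda  torchaudio-2.2.2-py311_cpu.tar.bz2
torchaudio  2.2.2             12.2 MiB  pypi   torchaudio-2.2.2-cp311-cp311-manylinux1_x86_64.whl
"""
print(find_duplicates(listing))
```

Any name reported here is installed from both sources and is a candidate for the kind of conflict described in this issue.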

I think only one version of torch is installed, which is nice, but installing torchaudio from PyPI breaks the installation, which was fine without pyannote.audio:

import torchaudio
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/p.niemiec/Repos/diarization-poc/.pixi/envs/diarization-demo/lib/python3.11/site-packages/torchaudio/__init__.py", line 2, in <module>
    from . import _extension  # noqa  # usort: skip
    ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/p.niemiec/Repos/diarization-poc/.pixi/envs/diarization-demo/lib/python3.11/site-packages/torchaudio/_extension/__init__.py", line 38, in <module>
    _load_lib("libtorchaudio")
  File "/home/p.niemiec/Repos/diarization-poc/.pixi/envs/diarization-demo/lib/python3.11/site-packages/torchaudio/_extension/utils.py", line 60, in _load_lib
    torch.ops.load_library(path)
  File "/home/p.niemiec/Repos/diarization-poc/.pixi/envs/diarization-demo/lib/python3.11/site-packages/torch/_ops.py", line 933, in load_library
    ctypes.CDLL(path)
  File "/home/p.niemiec/Repos/diarization-poc/.pixi/envs/diarization-demo/lib/python3.11/ctypes/__init__.py", line 376, in __init__
    self._handle = _dlopen(self._name, mode)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^
OSError: libtorch_cuda.so: cannot open shared object file: No such file or directory
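The traceback fails on libtorch_cuda.so, which suggests the torchaudio files on disk were built against a CUDA build of torch rather than the CPU one. One way to check where a package would be loaded from, and which tool installed it, without importing it (and so without triggering the OSError) is sketched below. It uses only the standard library; the helper name describe_install is hypothetical, and the INSTALLER value depends on whichever tool wrote the metadata, so treat the result as a hint rather than proof.

```python
from importlib import metadata, util
from typing import Optional


def describe_install(name: str) -> Optional[dict]:
    """Report a top-level module's location and the tool that installed it.

    find_spec() does not execute a top-level package, so a broken native
    extension cannot raise here.
    """
    spec = util.find_spec(name)
    if spec is None:
        return None  # not importable from this environment
    info = {"origin": spec.origin, "version": None, "installer": None}
    try:
        dist = metadata.distribution(name)
    except metadata.PackageNotFoundError:
        return info  # importable, but no distribution metadata under this name
    info["version"] = dist.version
    installer = dist.read_text("INSTALLER")  # None when the file is absent
    info["installer"] = installer.strip() if installer else None
    return info


print(describe_install("torchaudio"))
```

If two sets of metadata exist for the same name, only the first one found is reported, so a clean result here does not rule out a duplicate.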

Expected behavior

Only conda CPU version of torchaudio kept in env, working import for torchaudio/pyannote.audio

niemiaszek avatar Apr 18 '24 22:04 niemiaszek

Also general questions:

When I run pixi add from inside pixi shell -e ..., should my shell get reloaded? I find myself removing .lock and .pixi quite often, but I'm still a pixi noob and my envs are quite complex, so it might be a skill issue.

I also tried using channels = ["nvidia", {channel = "pytorch", priority = "-1"}] as in the Multiple machines from one project example, but it seems like "priority" isn't supported.

niemiaszek avatar Apr 18 '24 23:04 niemiaszek

This issue is probably due to our map only working for conda-forge. @nichmor What do we do with non conda-forge (e.g. pytorch) when we map conda to pypi?

@niemiaszek There was a typo in the documentation: the priority is an int, so you need to lose the quotes. Fix on its way: https://github.com/prefix-dev/pixi/pull/1234

pixi add should trigger a reload of the environment on bash and zsh. If you don't trust it, just exit and run pixi shell again.

What errors make you remove those files?

ruben-arts avatar Apr 19 '24 07:04 ruben-arts

This issue is probably due to our map only working for conda-forge. @nichmor What do we do with non conda-forge (e.g. pytorch) when we map conda to pypi?

Hey! On non-conda-forge channels we don't assume that the conda name is the same as the PyPI name, which is why torchaudio is installed twice.

nichmor avatar Apr 19 '24 07:04 nichmor

Can't we add pytorch to the mapping though?

baszalmstra avatar Apr 19 '24 07:04 baszalmstra

Can't we add pytorch to the mapping though?

Yes, sure! We can extend it to all channels.

nichmor avatar Apr 19 '24 08:04 nichmor

pixi add should trigger a reload of the environment on bash and zsh. If you don't trust it, just exit and run pixi shell again.

What errors make you remove those files?

Hard for me to tell right now, as I was making some changes in a rush, but these were mostly dependency issues. One example I can recall is similar to #1194: I added dask with pixi add, and then importing dask failed with 'pyarrow' has no attribute '__version__'. Most errors like these came from pyarrow being installed from both conda and pip, but it's perfectly fine for me now.

I will pay more attention to this topic and try to reproduce some examples.

niemiaszek avatar Apr 19 '24 08:04 niemiaszek

Thanks for the fast response. ML frameworks usually make life hard, but I'm quite impressed by how easy my envs are to set up. This mapping thing is indeed important.

CUDA-related libs are also a bit of an edge case. I'm quite amazed that there is still no common CUDA target. My dream would be the ability to easily set up the major frameworks [torch, tensorflow, jax, mlx] with GPU/CPU support. This would also require handling cases such as installing pip wheels for CUDA with TF and conda packages from the nvidia channel for PyTorch, ultimately ending up with one CUDA installation.

I mentioned #261 with usage of Keras 3, which would be a fun example to play with, as Keras allows using the same codebase and switching only the backend (it supports all major frameworks, with MLX support on the way). I think this would be The Ultimate Benchmark, covering most user scenarios. I will fiddle with it a bit and try to set up one env with 3 frameworks, testing pip and conda combinations.

niemiaszek avatar Apr 19 '24 09:04 niemiaszek

@nichmor I've seen you are busy with other tasks, but I have a question related to this issue. I tried solving this env again on 0.23 and torchaudio got installed again in both the conda and pypi versions. Is there currently any workaround for this, like some manual mapping?

niemiaszek avatar May 28 '24 09:05 niemiaszek

@nichmor I've seen you are busy with other tasks, but I have a question related to this issue. I tried solving this env again on 0.23 and torchaudio got installed again in both the conda and pypi versions. Is there currently any workaround for this, like some manual mapping?

Hey @niemiaszek! Let me see what the problem is and come back with a solution.

nichmor avatar May 28 '24 11:05 nichmor

hey @niemiaszek ! You can define a custom mapping under project:

conda-pypi-map = { "pytorch" = "local_mapping.json" }

[tasks]

[dependencies]
captum = {version="*", channel="pytorch"}
boltons = {version = "*"}

[pypi-dependencies]
captum = { version = "*"}

Inside local_mapping.json you can have {"captum": "captum"}, which means that conda's captum is mapped to PyPI's captum. You can also use {"captum": null}, in which case conda's captum will not be mapped.

Let me know if it helps you or you have any questions

nichmor avatar May 28 '24 12:05 nichmor

@nichmor I think I kinda get how this should work, but after making a local mapping with {"torchaudio":"torchaudio"}, torchaudio was added correctly I think (only the conda CPU version in pixi list), but the other PyTorch packages didn't get mapped anymore. I assume I should concatenate my local map with the regular map that is already used for PyTorch. Previously it mapped pypi "torch" from the pyannote.audio requirements to conda's "pytorch", which I specified in pixi.toml, but installed torchaudio from both conda (desired) and pypi (undesired).

Can I find the regular mapping somewhere so I can just add one record for torchaudio, that won't overwrite current mappings?

niemiaszek avatar May 28 '24 15:05 niemiaszek

I also noticed some interesting warnings from the CLI while doing so.

  1. After I added the conda-pypi-map entry I got the following warning: WARN pixi::project::manifest: Defined custom mapping channel https://conda.anaconda.org/pytorch/ is missing from project channels
     Please note that I don't use the pytorch channel in the default env, so that might be the cause.

  2. When entering my env with the custom mapping via pixi shell: WARN pixi::install_pypi: These conda-packages will be overridden by pypi: pytorch
     This would align with the fact that in this env "pytorch" got installed from conda and "torch" with all its requirements got installed from pypi.

  3. After I removed the custom mapping and set up my env again: WARN pixi::install_pypi: These conda-packages will be overridden by pypi: torchaudio
     Both torchaudios got installed.

niemiaszek avatar May 28 '24 16:05 niemiaszek

One strange thing is that after I passed {"torchaudio":"torchaudio", "pytorch": "torch"} as my local mapping, I got the correct output from pixi list (torchaudio only from conda, same for pytorch), but I couldn't access packages from the pytorch channel at all in Python. Both import torch and import torchaudio failed to find the module.

I think I'm doing something wrong here

niemiaszek avatar May 28 '24 16:05 niemiaszek

Hey @niemiaszek! Sorry that I wasn't very explicit about the mapping and how it works. If you define a custom mapping for a specific channel, pixi will no longer request our own mapping for it. You can use this one as a starter and add torchaudio to it. Please note that this mapping contains packages only from the conda-forge channel, not pytorch.

After merging our mapping with yours and adding torchaudio to it, I think your import issue should be fixed. (After adding a new mapping, please remove the lock file; we currently don't invalidate the lock file when the mapping changes.) Let me know if it helps.
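The merge described above can be sketched as follows. This is an illustration under assumptions, not an official pixi tool: base_mapping stands in for the contents of the starter mapping (a JSON object from conda name to PyPI name, as in the {"captum": "captum"} example), and the function names are placeholders.

```python
import json
from pathlib import Path
from typing import Dict, Optional

Mapping = Dict[str, Optional[str]]


def merge_mapping(base: Mapping, extra: Mapping) -> Mapping:
    """Return a new mapping; entries in `extra` win over entries in `base`."""
    merged = dict(base)
    merged.update(extra)
    return merged


def write_mapping(path: Path, mapping: Mapping) -> None:
    """Write the mapping as JSON; a None value is serialized as null."""
    path.write_text(json.dumps(mapping, indent=2, sort_keys=True) + "\n")


# Stand-in for the starter mapping; in practice, load the real file with
# json.loads(Path(...).read_text()).
base_mapping: Mapping = {"pytorch": "torch", "torchvision": "torchvision"}
merged = merge_mapping(base_mapping, {"torchaudio": "torchaudio"})
print(json.dumps(merged, indent=2, sort_keys=True))
```

Calling write_mapping(Path("local_mapping.json"), merged) would then produce the file referenced by conda-pypi-map in the manifest.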

nichmor avatar May 29 '24 07:05 nichmor

@nichmor thanks for the help! Sorry for the late response, but I was busy with other stuff this week :face_in_clouds:

Everything works perfectly now. I expected that some extension of the mapping was needed.

I was just confused about how pytorch and torchvision get installed correctly, even though they are also from the pytorch channel. However, these packages are included in the default mapping, and torchaudio was just the one left behind.

Do you think there will be a way to make it work out of the box? I've seen you already started considering an easier way to patch the mapping.

niemiaszek avatar Jun 06 '24 09:06 niemiaszek