OpenChatKit
Cannot create conda environment
Describe the bug
Followed the instructions but could not get
conda env create -f environment.yml
to work because of:
ResolvePackageNotFound:
- cudatoolkit=11.6.0
- faiss-gpu=1.7.2
- nccl=2.12.12.1
- cupy=10.4.0
To Reproduce
Steps to reproduce the behavior: install Miniconda, then run conda env create -f environment.yml.
Expected behavior
An environment called OpenChatKit is created; instead, creation fails.
Desktop: Mac
I encountered a similar issue:
ResolvePackageNotFound:
- nccl=2.12.12.1
Any help would be greatly appreciated.
I'm on a Windows machine. Is this package not supported on Windows?!
Same issue. I'm on Ubuntu.
Ugh, this is frustrating. I wish the requirements and instructions were clearer.
https://github.com/togethercomputer/OpenChatKit/issues/19#issuecomment-1465120380
Same issue on Mac.
I removed the nccl line from environment.yml first, did the installation, and then executed:
conda install -c conda-forge nccl
Any update? I am still having this issue.
Collecting package metadata (repodata.json): done
Solving environment: failed
ResolvePackageNotFound:
- cupy=10.4.0
- faiss-gpu=1.7.2
- nccl=2.12.12.1
- cudatoolkit=11.6.0
I checked cupy, particularly https://docs.cupy.dev/en/stable/install.html, where that version does not exist:
| CUDA | Command |
|---|---|
| v10.2 (x86_64) | pip install cupy-cuda102 |
| v10.2 (aarch64 - JetPack 4) | pip install cupy-cuda102 -f https://pip.cupy.dev/aarch64 |
| v11.0 (x86_64) | pip install cupy-cuda110 |
| v11.1 (x86_64) | pip install cupy-cuda111 |
| v11.2 ~ 11.8 (x86_64) | pip install cupy-cuda11x |
| v11.2 ~ 11.8 (aarch64 - JetPack 5 / Arm SBSA) | pip install cupy-cuda11x -f https://pip.cupy.dev/aarch64 |
| v12.x (x86_64) | pip install cupy-cuda12x |
| v12.x (aarch64 - JetPack 5 / Arm SBSA) | pip install cupy-cuda12x -f https://pip.cupy.dev/aarch64 |
I am getting the same issue with Mamba.
conda install mamba -n base -c conda-forge
mamba env create -f environment.yml -n OpenChatKit-Test
Getting the following errors:
Could not solve for environment specs
Encountered problems while solving:
- nothing provides requested cudatoolkit 11.6.0**
- nothing provides requested cupy 10.4.0**
- nothing provides requested faiss-gpu 1.7.2**
- nothing provides requested nccl 2.12.12.1**
- nothing provides cuda 11.6.* needed by pytorch-cuda-11.6-h867d48c_0
The environment can't be solved, aborting the operation
Running on macOS 12.6.3
It would be nice if the README added the prerequisites for setting up the environment.
I'll update the README. I believe these packages are only available on Linux. Windows users might be able to use WSL (see issue #19), but I don't think this will run on macOS.
This helped on Ubuntu: conda config --set channel_priority false
For anyone trying to run inference on a Mac (FYI, the training scripts will not work):
This environment.yml worked for me. Since Macs don’t have a CUDA device, you’re going to have to use CPU packages. There is a way to leverage GPU acceleration with MPS but I haven’t tried that yet.
For inference, you’d have to modify the Python script to use CPU. I’m going to put up a PR soon, but for now, reference this bot.py (and see the sketch after the environment file below).
Changes:
- Remove nccl (it only works on Linux). Note: you won’t be able to use the training scripts because nccl only works on Linux. (Maybe LoRA will still work?)
- Add Rust to the conda dependency list because it is needed to build the pip wheel for transformers on Mac.
- Specify versions for numpy and pillow.
- Change faiss-gpu to faiss-cpu.
- Remove pytorch-cuda, cupy, and cudatoolkit (CUDA dependencies).
environment.yml for Mac:

```yaml
name: OpenChatKit
channels:
  - pytorch
  - nvidia
  - conda-forge
  - defaults
dependencies:
  - faiss-cpu=1.7.4
  - fastparquet=0.5.0
  - pip=22.3.1
  - pyarrow=8.0.0
  - python=3.10.9
  - python-snappy=0.6.1
  - pytorch=1.13.1
  - snappy=1.1.9
  - torchaudio=0.13.1
  - torchvision=0.14.1
  - rust=1.69.0
  - pip:
      - accelerate==0.17.1
      - datasets==2.10.1
      - loguru==0.6.0
      - netifaces==0.11.0
      - transformers==4.27.4
      - wandb==0.13.10
      - zstandard==0.20.0
      - numpy==1.24.3
      - pillow==9.5.0
```
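For reference, a minimal sketch of what the CPU-only inference path could look like. This is not the exact bot.py code; the prompt format and generation parameters here are illustrative assumptions:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Model mentioned in this thread; substitute whichever checkpoint you use.
model_name = "togethercomputer/RedPajama-INCITE-Base-3B-v1"

tokenizer = AutoTokenizer.from_pretrained(model_name)
# bfloat16 (or float32) on CPU; float16 LayerNorm is not implemented on CPU.
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16)
model.to("cpu")

# Illustrative OpenChatKit-style prompt; adjust to the model's expected format.
inputs = tokenizer("<human>: Hello!\n<bot>:", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=True, top_k=16)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```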
Managed to run, thanks to @orangetin. But whenever I enter a command, it ends with:
RuntimeError: "LayerNormKernelImpl" not implemented for 'Half'
Mac M1.
@EdgBuc make sure the dtype is not float16. Set it to either bfloat16 or float32. (CPU does not support float16). Reference this line.
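A minimal sketch of that dtype guard, assuming the usual CUDA-vs-CPU check; the variable names are illustrative, not the exact bot.py code:

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
# CPU kernels lack float16 LayerNorm, so use bfloat16 (or float32) off-GPU.
torch_dtype = torch.float16 if device == "cuda" else torch.bfloat16
```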
Indeed, removing the if/else and leaving only
torch_dtype = torch.bfloat16
did the trick. BTW, it is super slow (1 word per minute) in answering :)
You don't need to modify the if/else statement if you pass --no-gpu and -r as args. See this for more info.
Yeah, that is expected if you're running this on CPU. However, on Apple Silicon, you should be able to set the device to mps and it should use acceleration. That is, change cpu in this line to mps and it should do the trick :)
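A small sketch of that device switch, assuming a PyTorch build with MPS support; the tiny module below is a stand-in for the real model:

```python
import torch

# Prefer the Metal (MPS) backend on Apple Silicon, falling back to CPU.
device = "mps" if torch.backends.mps.is_available() else "cpu"

# Stand-in module to demonstrate the move; bot.py would move its model instead.
layer = torch.nn.Linear(4, 4).to(device)
x = torch.randn(1, 4, device=device)
print(layer(x).device)
```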
Well, when I tried to change to mps, I got:
The operator 'aten::repeat_interleave.self_int' is not currently supported on the MPS backend and will fall back to run on the CPU. This may have performance implications. (Triggered internally at /Users/runner/work/_temp/anaconda/conda-bld/pytorch_1670525498485/work/aten/src/ATen/mps/MPSFallback.mm:11.)
input_ids = input_ids.repeat_interleave(expand_size, dim=0)
and then:
RuntimeError: Currently topk on mps works only for k<=16
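A hedged sketch of one possible workaround, assuming the script uses Hugging Face-style top-k sampling: clamp k to 16 when running on MPS (variable names here are illustrative):

```python
import torch

requested_top_k = 40
device = "mps" if torch.backends.mps.is_available() else "cpu"
# MPS currently implements topk only for k <= 16, so clamp the sampling k.
top_k = min(requested_top_k, 16) if device == "mps" else requested_top_k
# outputs = model.generate(**inputs, do_sample=True, top_k=top_k)  # as in bot.py
```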
I can't reproduce this error on an M2 (following the instructions I provided). This seems like a PyTorch error.
The only modification to the code was to change cpu to mps like I described earlier. Here's the command I ran:
python3 inference/bot.py --model togethercomputer/RedPajama-INCITE-Base-3B-v1 --no-gpu -r 16
Getting better speeds but still not great. Read this blog post if you want faster CPU inference.