
scGPT installation problem

Open ghost opened this issue 2 years ago • 23 comments

Hello, I encountered an error while executing pip install scgpt. The error message says "No module named 'torch'", but I have already installed torch. I'm not sure what is causing this issue. (screenshot attached)

ghost avatar May 27 '23 07:05 ghost

Hi, it looks like an installation issue with flash-attn. I have noticed various issues reported about newer versions of flash-attn. I recommend using CUDA 11.7 and flash-attn<1.0.5 for now.

Please see the new notice I added to the installation section of the README: https://github.com/bowang-lab/scGPT#installation
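For reference, under those constraints the install order might look like the following sketch (the cu117 wheel index URL is an assumption; check pytorch.org for the exact command matching your setup):

```shell
# Install a CUDA 11.7 build of PyTorch first, then a pre-1.0.5 flash-attn.
# --no-build-isolation lets flash-attn's build step see the torch already installed.
pip install torch --index-url https://download.pytorch.org/whl/cu117
pip install "flash-attn<1.0.5" --no-build-isolation
pip install scgpt
```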

subercui avatar May 30 '23 16:05 subercui

> Hi, it looks like an installation issue with flash-attn. I noticed there has been various issues reported about new versions of flash-attn. I recommend using CUDA 11.7 and flash-attn<1.0.5 for now.
>
> Please see the new notice I put in the note here in readme. https://github.com/bowang-lab/scGPT#installation

Thank you for your reply. I would like to ask: can scGPT be installed and used on a computer or server that does not have a GPU?

ghost avatar May 31 '23 15:05 ghost

Hi, technically it should be able to run on CPU. But currently, due to a legacy design choice, you need flash-attn to load the pretrained models, and flash-attn only works with a GPU. I am considering adding:

  1. a pip install option for a CPU-only version, such as pip install scgpt-cpu
  2. relaxing the requirement on flash-attn, and supporting loading of the pretrained models via a native PyTorch implementation on CPU

subercui avatar May 31 '23 16:05 subercui

> Hi, technically, it should be able to run on CPU. But currently, due to some legacy choice, you'll need the flash-attn to load pretrained models, and flash-attn only works with GPU. I am considering adding
>
>   1. a pip install option for CPU only version, such as pip install scgpt-cpu
>   2. Relax the requirements on flash-attn, and support the loading of pretrained model in a native pytorch implementation on cpu

Thank you very much for your reply. Looking forward to your follow-up updates!

ghost avatar Jun 01 '23 03:06 ghost

Hi, congrats and thanks for the great tool. Continuing the discussion: I tried several ways of getting the installation running, with flash-attn being the problem each time (torch not found, or other errors). The same issues have been reported on the flash-attn repo and are not always resolved: https://github.com/HazyResearch/flash-attention/issues/246. I would be really grateful if you could provide a Docker/Singularity container that can be used to run the program.

shobhitagrawal1 avatar Jul 05 '23 11:07 shobhitagrawal1

Thank you for the nice suggestion. I will try to set it up next week. In the meantime, are you using flash-attn<1.0.5? If not, I would suggest giving it a try.

subercui avatar Jul 05 '23 17:07 subercui

Hi, the main problem I have with the installation is that my NVIDIA CUDA drivers are version 12.1. If I use a Docker container with PyTorch for 12.1 and build flash-attn from source, then when I pip install scGPT it proceeds to downgrade my torch and CUDA libraries to 11 and rebuild flash-attn for CUDA 11. So of course it won't run with the mismatched drivers. Can the requirement for torch be set to >= instead of ==? Or is there some other workaround? Thanks.

killerz99 avatar Jul 05 '23 19:07 killerz99

It seems to come from the torchdata package: it downgrades to torch 2.0.1 and then to CUDA 11.

root@d6088c2b9744:/workspace# pip show torch
Name: torch
Version: 2.1.0a0+fe05266
...
root@d6088c2b9744:/workspace# pip3 install torchdata
Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com
Collecting torchdata
  Downloading torchdata-0.6.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (4.6 MB)
     |████████████████████████████████| 4.6 MB 3.4 MB/s
Requirement already satisfied: requests in /usr/local/lib/python3.8/dist-packages (from torchdata) (2.28.2)
Requirement already satisfied: urllib3>=1.25 in /usr/local/lib/python3.8/dist-packages (from torchdata) (1.26.15)
Collecting torch==2.0.1
  Downloading torch-2.0.1-cp38-cp38-manylinux1_x86_64.whl (619.9 MB)
     |████████████████████████████████| 619.9 MB 17.2 MB/s
Collecting nvidia-nccl-cu11==2.14.3
  Downloading nvidia_nccl_cu11-2.14.3-py3-none-manylinux1_x86_64.whl (177.1 MB)
     |██████████▎                     | 56.6 MB 40.0 MB/s eta 0:00:04^C
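One possible workaround (a sketch, at your own risk, since pip will no longer check torchdata's declared requirements) is to install torchdata without letting pip act on its pinned torch:

```shell
# torchdata 0.6.1 pins torch==2.0.1, which triggers the downgrade seen above;
# --no-deps installs torchdata alone and leaves the container's torch in place.
pip install torchdata==0.6.1 --no-deps
python -c "import torch; print(torch.__version__)"  # verify torch was untouched
```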

killerz99 avatar Jul 05 '23 21:07 killerz99

Is it possible to swap it out for torchdatasets?

https://github.com/szymonmaszke/torchdatasets

killerz99 avatar Jul 05 '23 21:07 killerz99

Something else also wants torch besides torchdata, so upgrading torchdata alone is unlikely to fix it:

#31 17.71 Requirement already satisfied: torchdata==0.6.1 in /usr/local/lib/python3.8/dist-packages (from torchtext>=0.14.0->scgpt==0.1.2.post1) (0.6.1)
#31 17.94 Collecting torch>=1.8.0
#31 18.00   Downloading torch-2.0.1-cp38-cp38-manylinux1_x86_64.whl (619.9 MB)
#31 44.07 Collecting nvidia-cusparse-cu11==11.7.4.91

killerz99 avatar Jul 05 '23 23:07 killerz99

> Thank you for the nice suggestion. I will try to set it up next week. In the meantime, are you using flash-attn<1.0.5? If not, I would suggest having a try

I tried flash-attn 1.0.4 ... and just ran into more problems.

shobhitagrawal1 avatar Jul 10 '23 15:07 shobhitagrawal1

pip install flash-attn==1.0.6 --no-build-isolation

This solved my problems. flash-attn 1.0.6 does not require reinstalling PyTorch.
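For context on why --no-build-isolation helps here: by default pip builds flash-attn in an isolated environment that does not contain torch, which is exactly what produces the "No module named 'torch'" error from the original post. A sketch of the resulting order (version numbers as in the comment above):

```shell
# Build-time dependencies must already be present when isolation is disabled:
pip install packaging
# Reuse the current environment (and its already-installed torch) for the build:
pip install flash-attn==1.0.6 --no-build-isolation
```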

dhoanghiep avatar Jul 18 '23 05:07 dhoanghiep

Uninstall all the NVIDIA GPU drivers, reinstall them through the local run file, and make sure the output of nvcc -V matches your CUDA version.

taffy-miao avatar Jul 18 '23 07:07 taffy-miao

@shenyeyouyi Thank you. I agree that nvcc -V should show the currently visible CUDA version, especially when you have multiple CUDA versions installed.
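If you want to script that check, the release number can be extracted from the nvcc -V output, for example (a sample output string is used here; in practice, pipe the real nvcc -V output through the same sed):

```shell
# Pull just the release number (e.g. "11.7") out of `nvcc -V` output:
nvcc_output="Cuda compilation tools, release 11.7, V11.7.99"
version=$(echo "$nvcc_output" | sed -n 's/.*release \([0-9.]*\),.*/\1/p')
echo "$version"  # prints 11.7
```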

subercui avatar Jul 19 '23 17:07 subercui

In a conda environment, the following steps solved my problems:

mamba install python=3.10.11 cudatoolkit=11.7 cudatoolkit-dev 'gxx>=6.0.0,<12.0' cudnn r-base r-devtools
pip install packaging
pip install flash-attn==1.0.6 --no-build-isolation
pip install scgpt

I hope this helps.

scarpio02 avatar Aug 07 '23 16:08 scarpio02

Hi @scarpio02, thanks, but I think some details may be missing:

pip install flash-attn==1.0.6 --no-build-isolation

results in:

Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com
Collecting flash-attn==1.0.6
(...)
  Preparing metadata (pyproject.toml) ... error
  error: subprocess-exited-with-error

  × Preparing metadata (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> [13 lines of output]
      Traceback (most recent call last):
      (...)
      File "", line 13, in
      ModuleNotFoundError: No module named 'torch'

Would you mind sharing what type of instance or container you ran this on? If you used an NVIDIA PyTorch container, which version did you use? If not, how did you get the drivers and other packages/apps installed, what's the path to nvcc (the next error after I install PyTorch), etc.? I can share my own details, but they would probably not be useful.

a-dna-n avatar Aug 07 '23 19:08 a-dna-n

Hi @bzip2. It seems that you haven't installed PyTorch? Maybe installing the GPU version of PyTorch will solve the problem.

taffy-miao avatar Aug 07 '23 19:08 taffy-miao

Hi @bzip2, you don't necessarily need the NVIDIA PyTorch container. You do need PyTorch, CUDA, and a compatible GPU before installing flash-attn. So could you please share those details?
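A quick pre-flight sketch for checking those prerequisites (each line reports ok or missing; nvidia-smi succeeding is only a proxy for a visible, working GPU):

```shell
# Check each flash-attn prerequisite and report its status:
python -c "import torch" 2>/dev/null && echo "torch: ok" || echo "torch: missing"
command -v nvcc >/dev/null 2>&1 && echo "nvcc: ok" || echo "nvcc: missing"
nvidia-smi >/dev/null 2>&1 && echo "gpu: ok" || echo "gpu: missing"
```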

subercui avatar Aug 07 '23 21:08 subercui

@scarpio02 thanks so much, your suggestion pointed me in the right direction! Here is what ultimately worked for me:

My machine has NVIDIA A100 80GB GPUs, NVIDIA driver version 530.41.03, and CUDA version 12.1. I'm now able to successfully run the cell type annotation tutorial notebook! Note that I installed scgpt without any of its dependencies (I didn't want it to overwrite my manually installed versions of torch and flash-attn), and then installed the dependencies I needed for the cell type annotation tutorial. For other functionalities, you might need to manually install a different subset of the dependencies.

conda create -n scgpt_2
conda activate scgpt_2
conda install python=3.10.11 cudatoolkit=11.7 cudatoolkit-dev 'gxx>=6.0.0,<12.0' cudnn
pip3 install torch torchvision torchaudio
pip install packaging
pip install "flash-attn<1.0.5" --no-build-isolation
conda install r-base r-devtools
pip install --no-deps scgpt
pip install ipykernel
python -m ipykernel install --user --name=scgpt_2
pip install pandas
pip install scanpy
pip install scvi-tools
pip install numba --upgrade
pip install numpy==1.24.4
pip install torchtext
pip install scib
pip install datasets==2.14.5 transformers==4.33.2
pip install wandb

rboiarsky avatar Sep 27 '23 14:09 rboiarsky

After installing the drivers on the PC, the above comment worked for me, even with fewer steps:

pip install packaging torch torchvision torchaudio && pip install "flash-attn<1.0.5" --no-build-isolation && pip install ipykernel pandas scanpy scvi-tools numba --upgrade "numpy<1.24" torchtext scib datasets==2.14.5 transformers==4.33.2 wandb cell-gears torch_geometric && pip install --no-deps scgpt

I needed numpy<1.24 for compatibility with other packages.

jkobject avatar Nov 02 '23 16:11 jkobject

The guide on the README page should be updated with the suggestions here. I was able to install scGPT with everyone's input!

yubin-ai avatar Jul 17 '24 16:07 yubin-ai