Installation of 0.2.4 and some other minor suggestions
Dear scGPT developers/channel,
I started exploring this wonderful package and would like to share my experience with installation and an issue with CPU usage.
I have been able to install on our CentOS 8 system (with gcc 8.5.0 and Python 9.12, 11) from both a GitHub clone and a release. Perhaps I should have used a better setup such as Miniconda3, but I opted to start simple with python -m venv. The former works well, but the latter appears to have issues with the mudata and anndata modules.
My scripts are available from here, https://cambridge-ceu.github.io/csd3/systems/setup.html#fn:scGPT.
On running several notebooks in tutorials/, I needed to add the argument "map_location=device" to the torch.load() calls, because in my case GPU access requires a separate application for permission. Since "device" is already defined in the notebooks, it would be easier to have this included by default. I also need to be careful about file locations, since the path is often "../data" but sometimes "./reference".
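For anyone hitting the same problem, here is a minimal sketch of the change. It is not the notebooks' actual code: the toy model, the file name toy_model.pt and the helper load_state_dict are hypothetical stand-ins, used only so the snippet runs on its own. The only part taken from the tutorials is the idea of passing the already defined "device" to torch.load().

```python
import tempfile
from pathlib import Path

import torch

# Pick the GPU when one is usable, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")


def load_state_dict(path, device):
    """Load a saved state dict, remapping its tensors onto `device`.

    Hypothetical helper, not part of scGPT. Without map_location,
    torch.load() tries to restore tensors to the device they were saved
    from, which fails for a GPU-saved checkpoint on a CPU-only machine.
    """
    return torch.load(path, map_location=device)


# Stand-in checkpoint (assumption: any small module will do for the demo).
model = torch.nn.Linear(4, 2)
with tempfile.TemporaryDirectory() as tmp:
    ckpt = Path(tmp) / "toy_model.pt"
    torch.save(model.state_dict(), ckpt)
    state = load_state_dict(ckpt, device)

model.load_state_dict(state)
```

With this pattern the same cell should run unchanged on a CPU-only node and on a GPU node, which is why having it in the notebooks by default seems harmless.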
I believe these are minor and possibly a matter of personal taste, so a pull request could cover them.
As others in the list, huge thanks for the great work!
Jing Hua
BTW we have this,
$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Wed_Jul_14_19:41:19_PDT_2021
Cuda compilation tools, release 11.4, V11.4.100
Build cuda_11.4.r11.4/compiler.30188945_0
Thanks a lot for sharing the experience. Do you suggest modifying the notebooks to include map_location=device?
Indeed, I have added "map_location=device" in a number of places in the notebooks to get going, except in "Tutorial_Reference_Mapping.ipynb", which requires manually uncommenting chunks specific to CPUs (though, as mentioned, my case is probably rare, and since this is a very important demo I don't mind spending some time going through it). I also realised that the tutorials generate their own data/ and save/ folders besides using ../data, ../save, etc.
I have seen a number of posts about installation issues, which made me think it might be a good idea to distribute a Singularity module that encapsulates the requirements. The data and reference files are not a problem, since they can be mapped from outside.
Could you create an Apptainer or Docker image of the package environment? The installation guide is not enough to get the package installed, and all kinds of problems occur when running the tutorial programs.
I am not actively working on the type of data right now, but will surely consider this later if necessary.