
[QST] rsc.tl.umap returns RAFT failure

Open johnsCheng opened this issue 1 year ago • 6 comments

What is your question? I cannot run rsc.tl.umap, although all the steps before the UMAP dimensionality reduction run fine. I installed RSC by creating a conda environment from rapids-24.10 and running pip install rapids-singlecell (version 0.10.10). I'm not sure whether the error comes from RSC or from RAPIDS. Here is the error report:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Cell In[104], line 1
----> 1 rsc.tl.umap(fibro, min_dist=0.3) #min_dist float (default: 0.5)

File ~/.conda/envs/rapids-24.10/lib/python3.10/site-packages/rapids_singlecell/tools/_umap.py:182, in umap(adata, min_dist, spread, n_components, maxiter, alpha, negative_sample_rate, init_pos, random_state, a, b, key_added, neighbors_key, copy)
    163 umap = UMAP(
    164     n_neighbors=n_neighbors,
    165     n_components=n_components,
    (...)
    178     precomputed_knn=pre_knn,
    179 )
    181 key_obsm, key_uns = ("X_umap", "umap") if key_added is None else [key_added] * 2
--> 182 adata.obsm[key_obsm] = umap.fit_transform(X)
    184 adata.uns[key_uns] = {"params": stored_params}
    185 return adata if copy else None

File ~/.conda/envs/rapids-24.10/lib/python3.10/site-packages/cuml/internals/api_decorators.py:188, in _make_decorator_function.<locals>.decorator_function.<locals>.decorator_closure.<locals>.wrapper(*args, **kwargs)
    185 set_api_output_dtype(output_dtype)
    187 if process_return:
--> 188     ret = func(*args, **kwargs)
    189 else:
    190     return func(*args, **kwargs)

File ~/.conda/envs/rapids-24.10/lib/python3.10/site-packages/cuml/internals/api_decorators.py:393, in enable_device_interop.<locals>.dispatch(self, *args, **kwargs)
    391 if hasattr(self, "dispatch_func"):
    392     func_name = gpu_func.__name__
--> 393     return self.dispatch_func(func_name, gpu_func, *args, **kwargs)
    394 else:
    395     return gpu_func(self, *args, **kwargs)

File ~/.conda/envs/rapids-24.10/lib/python3.10/site-packages/cuml/internals/api_decorators.py:190, in _make_decorator_function.<locals>.decorator_function.<locals>.decorator_closure.<locals>.wrapper(*args, **kwargs)
    188     ret = func(*args, **kwargs)
    189 else:
--> 190     return func(*args, **kwargs)
    192 return cm.process_return(ret)

File base.pyx:687, in cuml.internals.base.UniversalBase.dispatch_func()

File umap.pyx:741, in cuml.manifold.umap.UMAP.fit_transform()

File ~/.conda/envs/rapids-24.10/lib/python3.10/site-packages/cuml/internals/api_decorators.py:188, in _make_decorator_function.<locals>.decorator_function.<locals>.decorator_closure.<locals>.wrapper(*args, **kwargs)
    185 set_api_output_dtype(output_dtype)
    187 if process_return:
--> 188     ret = func(*args, **kwargs)
    189 else:
    190     return func(*args, **kwargs)

File ~/.conda/envs/rapids-24.10/lib/python3.10/site-packages/cuml/internals/api_decorators.py:393, in enable_device_interop.<locals>.dispatch(self, *args, **kwargs)
    391 if hasattr(self, "dispatch_func"):
    392     func_name = gpu_func.__name__
--> 393     return self.dispatch_func(func_name, gpu_func, *args, **kwargs)
    394 else:
    395     return gpu_func(self, *args, **kwargs)

File ~/.conda/envs/rapids-24.10/lib/python3.10/site-packages/cuml/internals/api_decorators.py:190, in _make_decorator_function.<locals>.decorator_function.<locals>.decorator_closure.<locals>.wrapper(*args, **kwargs)
    188     ret = func(*args, **kwargs)
    189 else:
--> 190     return func(*args, **kwargs)
    192 return cm.process_return(ret)

File base.pyx:687, in cuml.internals.base.UniversalBase.dispatch_func()

File umap.pyx:678, in cuml.manifold.umap.UMAP.fit()

RuntimeError: RAFT failure at file=~/.conda/envs/rapids-24.10/include/raft/spectral/detail/lapack.hpp line=490:
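
For context, my workflow is roughly the following (a sketch only; the exact preprocessing steps may differ, and `fibro` is my AnnData object). Everything up to the last line runs without error:

```python
# Rough sketch of the workflow; only the final UMAP call fails.
rsc.get.anndata_to_GPU(fibro)       # move the AnnData to the GPU
rsc.pp.pca(fibro)                   # PCA on the GPU
rsc.pp.neighbors(fibro)             # kNN graph used by UMAP
rsc.tl.umap(fibro, min_dist=0.3)    # <-- raises the RAFT failure above
```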

johnsCheng avatar Nov 21 '24 10:11 johnsCheng

@johnsCheng

I tried to reproduce your issue, but I can't. Do you have a minimal reproducer? I also think there might be an issue with your installation. Can you please confirm all the versions of RAPIDS and rapids-singlecell? Also, please try rapids-singlecell==0.10.11.
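
For example, something like this would confirm the versions that actually get imported (a minimal check; adjust it to whatever else you have installed):

```python
# Print the versions of the packages involved, to see whether the problem
# sits in rapids-singlecell or in the underlying RAPIDS stack.
import cupy as cp
import cuml
import rapids_singlecell as rsc

print("rapids-singlecell:", rsc.__version__)
print("cuml:", cuml.__version__)
print("cupy:", cp.__version__)
```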

Intron7 avatar Nov 21 '24 13:11 Intron7

Thanks for your advice! I have now installed the new version of rapids-singlecell (0.10.11).

$conda list rapids
# packages in environment at /share/home/jinghuic/software/miniconda3/envs/rapids_singlecell:
#
# Name                    Version                   Build  Channel
rapids                    24.10.00        cuda12_py311_241009_g19a0c5a_0    rapidsai
rapids-dask-dependency    24.10.00                   py_0    rapidsai
rapids-singlecell         0.10.11                  pypi_0    pypi
rapids-xgboost            24.10.00        cuda12_py311_241009_g19a0c5a_0    rapidsai

But now I'm stuck at the first step, rsc.get.anndata_to_GPU, which returns this error:

rapids_singlecell version is 0.10.11
(159682, 27157)
0.0 8.974227131854372
(159682, 27157)
0.0 8.974227131854372
0
<class 'anndata._core.views.SparseCSRMatrixView'>
TypeError: float() argument must be a string or a real number, not 'csr_matrix'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/share/home/jinghuic/script/read.py", line 38, in <module>
    rsc.get.anndata_to_GPU(batch)
  File "/share/home/jinghuic/software/miniconda3/envs/rapids_singlecell/lib/python3.11/site-packages/rapids_singlecell/get/_anndata.py", line 63, in anndata_to_GPU
    _set_obs_rep(adata, X, layer=layer)
  File "/share/home/jinghuic/software/miniconda3/envs/rapids_singlecell/lib/python3.11/site-packages/scanpy/get/get.py", line 471, in _set_obs_rep
    adata.X = val
    ^^^^^^^
  File "/share/home/jinghuic/software/miniconda3/envs/rapids_singlecell/lib/python3.11/site-packages/anndata/_core/anndata.py", line 650, in X
    self._adata_ref._X[oidx, vidx] = value
    ~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^
  File "/share/home/jinghuic/software/miniconda3/envs/rapids_singlecell/lib/python3.11/site-packages/scipy/sparse/_csr.py", line 41, in __setitem__
    return super().__setitem__(key, value)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/share/home/jinghuic/software/miniconda3/envs/rapids_singlecell/lib/python3.11/site-packages/scipy/sparse/_index.py", line 145, in __setitem__
    x = np.asarray(x, dtype=self.dtype)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
**ValueError: setting an array element with a sequence.**

I have no idea what is going wrong; the following is my script:

  1 import scanpy as sc
  2 import rapids_singlecell as rsc
  3 print('rapids_singlecell version is',rsc.__version__)
  4 import numpy as np
  5 read_file='python/Wang_NatCancer.h5ad'
  6 mydata=sc.read(read_file)
  7 print(mydata.X.shape)
  8 print(np.min(mydata.X),np.max(mydata.X))
  9 import cupy as cp
 10 
 11 import time
 12 
 13 import warnings
 14 
 15 warnings.filterwarnings("ignore")
 16 import rmm
 17 from rmm.allocators.cupy import rmm_cupy_allocator
 18 
 19 rmm.reinitialize(
 20     managed_memory=True,  # Allows oversubscription
 21     pool_allocator=False,  # default is False
 22     devices=3,  # GPU device IDs to register. By default registers only GPU 0.
 23 )
 24 cp.cuda.set_allocator(rmm_cupy_allocator)
 25 import anndata as ad
 26 print(mydata.X.shape)
 27 print(np.min(mydata.X),np.max(mydata.X))
 28 import gc
 29 import scipy.sparse as sp
 30 # Handle memory error, e.g., by reducing data size or using CPU as fallback
 31 # Example: Process data in smaller batches
 32 batch_size = 1000  # Adjust batch size as needed
 33 for i in range(0, len(mydata), batch_size):
 34     batch = mydata[i:i+batch_size]
 35     print(i)
 36     try:
 37         print(type(batch.X))
 38         rsc.get.anndata_to_GPU(batch)
 39     except MemoryError as e:
 40         print("MemoryError SMALL in batch "+str(i) , e)
 41         # Handle batch-specific memory error
 42     finally:
 43         del batch
 44         gc.collect()
 45 
 46 # Ensure GPU memory is freed up
 47 cp.get_default_memory_pool().free_all_blocks()
 48 
 49 
 50 #rsc.get.anndata_to_GPU(mydata)

johnsCheng avatar Nov 27 '24 07:11 johnsCheng

@Intron7 PS: I also checked the related issue #261; I don't see any error from the installation.

$nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2024 NVIDIA Corporation
Built on Thu_Mar_28_02:18:24_PDT_2024
Cuda compilation tools, release 12.4, V12.4.131
Build cuda_12.4.r12.4/compiler.34097967_0

$nvidia-smi
Wed Nov 27 16:08:54 2024       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.129.03             Driver Version: 535.129.03   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA L40                     On  | 00000000:17:00.0 Off |                    0 |
| N/A   31C    P8              34W / 300W |      4MiB / 46068MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
|   1  NVIDIA L40                     On  | 00000000:65:00.0 Off |                    0 |
| N/A   30C    P8              37W / 300W |      4MiB / 46068MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
|   2  NVIDIA L40                     On  | 00000000:CA:00.0 Off |                    0 |
| N/A   31C    P8              34W / 300W |      4MiB / 46068MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
|   3  NVIDIA L40                     On  | 00000000:E3:00.0 Off |                    0 |
| N/A   32C    P8              33W / 300W |      4MiB / 46068MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                                         
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+

And the TypeError there is not the same as mine; in my case it ends in a numerical/array error, **ValueError: setting an array element with a sequence.**
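
If it helps, the CUDA runtime that CuPy sees from inside the conda environment can also be checked, e.g.:

```python
# Sanity check of the GPU stack from inside the conda environment.
import cupy as cp

print(cp.__version__)                        # CuPy version
print(cp.cuda.runtime.runtimeGetVersion())   # CUDA runtime version, e.g. 12040 for 12.4
print(cp.cuda.runtime.getDeviceCount())      # number of visible GPUs
```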

Any advice would be much appreciated.

johnsCheng avatar Nov 27 '24 08:11 johnsCheng

Can you check what the dtype of X.data is?
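
For example (using the variable names from your script, and assuming X is a scipy sparse matrix):

```python
# Check the dtype of the values stored inside the sparse matrix,
# not just the dtype reported by the matrix object itself.
import scipy.sparse as sp

print(type(mydata.X))
if sp.issparse(mydata.X):
    print(mydata.X.data.dtype)
```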

Intron7 avatar Nov 27 '24 08:11 Intron7

Yeah, I have checked it (at line 37 of my script); here is the relevant part:

 20 print(mydata.X.shape)
 21 print(np.min(mydata.X),np.max(mydata.X))
 22 print(mydata.X.dtype)
 23 import gc
 24 import scipy.sparse as sp
 25 # Handle memory error, e.g., by reducing data size or using CPU as fallback
 26 # Example: Process data in smaller batches
 27 batch_size = 1000  # Adjust batch size as needed
 28 for i in range(0, len(mydata), batch_size):
 29     batch = mydata[i:i+batch_size]
 30     print(i)
 31     try:
 32         print(type(batch.X))
 33         print(batch.X.dtype)
 34         rsc.get.anndata_to_GPU(batch)

It's a <class 'anndata._core.views.SparseCSRMatrixView'>, with the same dtype (float64) as the normal AnnData:

rapids_singlecell version is 0.10.11
(159682, 27157)
0.0 8.974227131854372
float64
0
<class 'anndata._core.views.SparseCSRMatrixView'>
float64

TypeError: float() argument must be a string or a real number, not 'csr_matrix'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/share/home/jinghuic/script/read.py", line 38, in <module>
    rsc.get.anndata_to_GPU(batch)
  File "/share/home/jinghuic/software/miniconda3/envs/rapids_singlecell/lib/python3.11/site-packages/rapids_singlecell/get/_anndata.py", line 63, in anndata_to_GPU
    _set_obs_rep(adata, X, layer=layer)
  File "/share/home/jinghuic/software/miniconda3/envs/rapids_singlecell/lib/python3.11/site-packages/scanpy/get/get.py", line 471, in _set_obs_rep
    adata.X = val
    ^^^^^^^
  File "/share/home/jinghuic/software/miniconda3/envs/rapids_singlecell/lib/python3.11/site-packages/anndata/_core/anndata.py", line 650, in X
    self._adata_ref._X[oidx, vidx] = value
    ~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^
  File "/share/home/jinghuic/software/miniconda3/envs/rapids_singlecell/lib/python3.11/site-packages/scipy/sparse/_csr.py", line 41, in __setitem__
    return super().__setitem__(key, value)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/share/home/jinghuic/software/miniconda3/envs/rapids_singlecell/lib/python3.11/site-packages/scipy/sparse/_index.py", line 145, in __setitem__
    x = np.asarray(x, dtype=self.dtype)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: setting an array element with a sequence.

johnsCheng avatar Nov 28 '24 04:11 johnsCheng

Same error here when using rsc.get.anndata_to_GPU:

TypeError: float() argument must be a string or a real number, not 'csr_matrix'

However, when I save the adata to an .h5ad file and then reload it, everything works fine. I’m not sure what happened.
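
Roughly what I do now (the path is a placeholder):

```python
# Workaround: write the AnnData to disk and read it back before converting.
import scanpy as sc
import rapids_singlecell as rsc

adata.write_h5ad("tmp.h5ad")       # "tmp.h5ad" is a placeholder path
adata = sc.read_h5ad("tmp.h5ad")
rsc.get.anndata_to_GPU(adata)      # no TypeError after the round trip

# Untested guess: if the failure is specific to AnnData views,
# rsc.get.anndata_to_GPU(adata.copy()) might work as well.
```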

Donjae-Wang avatar Dec 15 '24 07:12 Donjae-Wang