scrublet Scrublet runs into an endless loop

Open aditinarsale opened this issue 1 year ago • 0 comments

Hi,

I am struggling to run Scrublet on my h5 files from some 10x Genomics data. I am a novice at any sort of coding, so in all likelihood this is due to some very obvious mistake on my part, and I would really appreciate any help.

I am running Scrublet on Python 3.10 and loading h5 files as the input. After loading an h5 file I get the following AnnData object:

AnnData object with n_obs × n_vars = 5542 × 36601 var: 'gene_ids', 'feature_types', 'genome'

So I assume the data is in the correct format, with rows corresponding to cells and columns corresponding to genes.

But if I run the Scrublet code on this object, it seems to run into an endless loop. If I interrupt the code, it gives the error below. Is there a step I am missing between loading the .h5 file in Python and getting Scrublet to process it?

Thanks, Aditi

KeyboardInterrupt                         Traceback (most recent call last)
Cell In[8], line 1
----> 1 scrub = scr.Scrublet(count_matrix, expected_doublet_rate=0.06)

File ~/anaconda3/envs/scRNA_Diabetes/lib/python3.10/site-packages/scrublet/scrublet.py:101, in Scrublet.__init__(self, counts_matrix, total_counts, sim_doublet_ratio, n_neighbors, expected_doublet_rate, stdev_doublet_rate, random_state)
      7 ''' Initialize Scrublet object with counts matrix and doublet prediction parameters
      8
      9 Parameters
   (...)
     97     the doublet neighbors of transcriptome i).
     98 '''
    100 if not scipy.sparse.issparse(counts_matrix):
--> 101     counts_matrix = scipy.sparse.csc_matrix(counts_matrix)
    102 elif not scipy.sparse.isspmatrix_csc(counts_matrix):
    103     counts_matrix = counts_matrix.tocsc()

File ~/anaconda3/envs/scRNA_Diabetes/lib/python3.10/site-packages/scipy/sparse/_compressed.py:79, in _cs_matrix.__init__(self, arg1, shape, dtype, copy)
     76 else:
     77     # must be dense
     78     try:
---> 79         arg1 = np.asarray(arg1)
     80     except Exception as e:
     81         raise ValueError("unrecognized {}_matrix constructor usage"
     82                          "".format(self.format)) from e

File ~/anaconda3/envs/scRNA_Diabetes/lib/python3.10/site-packages/anndata/_core/anndata.py:1171, in AnnData.__getitem__(self, index)
   1169 """Returns a sliced view of the object."""
   1170 oidx, vidx = self._normalize_indices(index)
-> 1171 return AnnData(self, oidx=oidx, vidx=vidx, asview=True)

File ~/anaconda3/envs/scRNA_Diabetes/lib/python3.10/site-packages/anndata/_core/anndata.py:360, in AnnData.__init__(self, X, obs, var, uns, obsm, varm, layers, raw, dtype, shape, filename, filemode, asview, obsp, varp, oidx, vidx)
    358     if not isinstance(X, AnnData):
    359         raise ValueError("X has to be an AnnData object.")
--> 360     self._init_as_view(X, oidx, vidx)
    361 else:
    362     self._init_as_actual(
    363         X=X,
    364         obs=obs,
   (...)
    376         filemode=filemode,
    377     )

File ~/anaconda3/envs/scRNA_Diabetes/lib/python3.10/site-packages/anndata/_core/anndata.py:408, in AnnData._init_as_view(self, adata_ref, oidx, vidx)
    406 # views on attributes of adata_ref
    407 obs_sub = adata_ref.obs.iloc[oidx]
--> 408 var_sub = adata_ref.var.iloc[vidx]
    409 self._obsm = adata_ref.obsm._view(self, (oidx,))
    410 self._varm = adata_ref.varm._view(self, (vidx,))

File ~/anaconda3/envs/scRNA_Diabetes/lib/python3.10/site-packages/pandas/core/indexing.py:1153, in _LocationIndexer.__getitem__(self, key)
   1150 axis = self.axis or 0
   1152 maybe_callable = com.apply_if_callable(key, self.obj)
-> 1153 return self._getitem_axis(maybe_callable, axis=axis)

File ~/anaconda3/envs/scRNA_Diabetes/lib/python3.10/site-packages/pandas/core/indexing.py:1691, in _iLocIndexer._getitem_axis(self, key, axis)
   1685     raise IndexError(
   1686         "DataFrame indexer is not allowed for .iloc\n"
   1687         "Consider using .loc for automatic alignment."
   1688     )
   1690 if isinstance(key, slice):
-> 1691     return self._get_slice_axis(key, axis=axis)
   1693 if is_iterator(key):
   1694     key = list(key)

File ~/anaconda3/envs/scRNA_Diabetes/lib/python3.10/site-packages/pandas/core/indexing.py:1727, in _iLocIndexer._get_slice_axis(self, slice_obj, axis)
   1725 labels = obj._get_axis(axis)
   1726 labels._validate_positional_slice(slice_obj)
-> 1727 return self.obj._slice(slice_obj, axis=axis)

File ~/anaconda3/envs/scRNA_Diabetes/lib/python3.10/site-packages/pandas/core/generic.py:4304, in NDFrame._slice(self, slobj, axis)
   4302 assert isinstance(slobj, slice), type(slobj)
   4303 axis = self._get_block_manager_axis(axis)
-> 4304 new_mgr = self._mgr.get_slice(slobj, axis=axis)
   4305 result = self._constructor_from_mgr(new_mgr, axes=new_mgr.axes)
   4306 result = result.__finalize__(self)

File internals.pyx:929, in pandas._libs.internals.BlockManager.get_slice()

File internals.pyx:913, in pandas._libs.internals.BlockManager._slice_mgr_rows()

File ~/anaconda3/envs/scRNA_Diabetes/lib/python3.10/site-packages/pandas/core/indexes/base.py:5391, in Index._getitem_slice(self, slobj)
   5387     # NB: Using _constructor._simple_new would break if MultiIndex
   5388     #  didn't override __getitem__
   5389     return self._constructor._simple_new(result, name=self._name)
-> 5391 def _getitem_slice(self, slobj: slice) -> Self:
   5392     """
   5393     Fastpath for __getitem__ when we know we have a slice.
   5394     """
   5395     res = self._data[slobj]

KeyboardInterrupt:

Oct 17 '23 02:10 aditinarsale
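
A possible reading of the traceback, offered as a guess rather than a confirmed diagnosis: `scipy.sparse.csc_matrix` calls `np.asarray` on its argument, and the next frame is `AnnData.__getitem__`, which suggests that `count_matrix` is the AnnData object itself and is being sliced piece by piece. If so, passing the underlying matrix (`adata.X`) instead may avoid the apparent hang. The sketch below illustrates the idea with a stand-in object so it runs without the real packages; `counts_from`, `FakeAnnData`, and the file name `sample.h5` are hypothetical names used only for illustration, and the post does not say how the file was loaded.

```python
import numpy as np
import scipy.sparse as sp


def counts_from(data):
    """Return a cells x genes CSC matrix from an AnnData-like object or a plain matrix.

    Hypothetical helper: AnnData keeps its data matrix in `.X`, so anything that
    has an `.X` attribute is unwrapped; everything else is treated as the matrix.
    """
    matrix = getattr(data, "X", data)
    if sp.issparse(matrix):
        return matrix.tocsc()
    return sp.csc_matrix(np.asarray(matrix))


class FakeAnnData:
    """Stand-in for an AnnData object (assumption: only `.X` matters here)."""

    def __init__(self, X):
        self.X = X


# 50 "cells" x 20 "genes" of random sparse data, purely for demonstration.
demo = FakeAnnData(sp.random(50, 20, density=0.1, format="csr", random_state=0))
counts = counts_from(demo)
print(counts.format, counts.shape)

# With scanpy and scrublet installed, the same idea would look roughly like this:
# import scanpy as sc
# import scrublet as scr
# adata = sc.read_10x_h5("sample.h5")  # hypothetical file name
# scrub = scr.Scrublet(adata.X, expected_doublet_rate=0.06)
# doublet_scores, predicted_doublets = scrub.scrub_doublets()
```

If `adata.X` is already sparse, Scrublet's own `issparse` check (visible at line 100 of the traceback) should skip the slow dense conversion entirely.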
