anndata
Seeming incompatibility with the numpy matrix subclass
Please make sure these conditions are met
- [X] I have checked that this issue has not already been reported.
- [X] I have confirmed this bug exists on the latest version of scanpy.
- [ ] (optional) I have confirmed this bug exists on the master branch of scanpy.
What happened?
For reasons I could not explain, the function `filter_genes` raises a `ValueError` when `adata.X` is of type `numpy.matrix`. It is easy to circumvent by converting `adata.X` to a plain `ndarray`, but I wanted to file it here for reference: `matrix` objects are still returned by default by some methods (such as the `todense()` method of a sparse matrix), and because `matrix` is a subclass of `ndarray`, it is not easy to identify the problem as a type error. Here is a minimal code sample.
Minimal code sample
import scanpy as sc
import anndata
import numpy as np
import pandas as pd
X = np.matrix([[1, 2], [3, 0]])
print(isinstance(X, np.ndarray))
ad = anndata.AnnData(X=X, obs={'obs_names': ['a', 'b']}, var={'vars_names': ['v1', 'v2']})
sc.pp.filter_genes(ad, min_cells=2)
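For reference, the workaround mentioned above can be sketched with NumPy alone. The helper name `as_plain_ndarray` is hypothetical and used only for illustration; it is not part of scanpy or anndata. The sketch also shows why an `isinstance` check does not catch the problem, while an exact type check does.

```python
import numpy as np


def as_plain_ndarray(X):
    """Hypothetical helper: return X as a base-class ndarray.

    np.asarray does not pass ndarray subclasses through, so an
    np.matrix input comes back as a plain ndarray.
    """
    return np.asarray(X)


X = np.matrix([[1, 2], [3, 0]])
X_plain = as_plain_ndarray(X)

# isinstance cannot tell the two apart, because matrix subclasses ndarray
print(isinstance(X, np.ndarray))        # True
print(isinstance(X_plain, np.ndarray))  # True

# an exact type check (or a check against np.matrix) can
print(type(X) is np.ndarray)            # False
print(type(X_plain) is np.ndarray)      # True
print(isinstance(X, np.matrix))         # True
```

Passing `X_plain` instead of `X` to `anndata.AnnData` is the conversion that avoids the error for me. For a scipy sparse matrix, calling `toarray()` rather than `todense()` should give a plain `ndarray` directly.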
Error output
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[47], line 9
7 print(isinstance(X, np.ndarray))
8 ad = anndata.AnnData(X=X, obs={'obs_names': ['a', 'b']}, var={'vars_names': ['v1', 'v2']})
----> 9 sc.pp.filter_genes(ad, min_cells=2)
File ~/miniconda3/envs/bio/lib/python3.10/site-packages/scanpy/preprocessing/_simple.py:250, in filter_genes(data, min_counts, min_cells, max_counts, max_cells, inplace, copy)
248 adata.var['n_counts'] = number
249 else:
--> 250 adata.var['n_cells'] = number
251 adata._inplace_subset_var(gene_subset)
252 return adata if copy else None
File ~/miniconda3/envs/bio/lib/python3.10/site-packages/pandas/core/frame.py:3950, in DataFrame.__setitem__(self, key, value)
3947 self._setitem_array([key], value)
3948 else:
3949 # set column
-> 3950 self._set_item(key, value)
File ~/miniconda3/envs/bio/lib/python3.10/site-packages/pandas/core/frame.py:4143, in DataFrame._set_item(self, key, value)
4133 def _set_item(self, key, value) -> None:
4134 """
4135 Add series to DataFrame in specified column.
4136
(...)
4141 ensure homogeneity.
4142 """
-> 4143 value = self._sanitize_column(value)
4145 if (
4146 key in self.columns
4147 and value.ndim == 1
4148 and not is_extension_array_dtype(value)
4149 ):
4150 # broadcast across multiple columns if necessary
4151 if not self.columns.is_unique or isinstance(self.columns, MultiIndex):
File ~/miniconda3/envs/bio/lib/python3.10/site-packages/pandas/core/frame.py:4870, in DataFrame._sanitize_column(self, value)
4867 return _reindex_for_setitem(Series(value), self.index)
4869 if is_list_like(value):
-> 4870 com.require_length_match(value, self.index)
4871 return sanitize_array(value, self.index, copy=True, allow_2d=True)
File ~/miniconda3/envs/bio/lib/python3.10/site-packages/pandas/core/common.py:576, in require_length_match(data, index)
572 """
573 Check the length of data matches the length of the index.
574 """
575 if len(data) != len(index):
--> 576 raise ValueError(
577 "Length of values "
578 f"({len(data)}) "
579 "does not match length of index "
580 f"({len(index)})"
581 )
ValueError: Length of values (1) does not match length of index (2)
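A possible explanation for the reported length of 1, offered as a sketch rather than a confirmed diagnosis: reductions on a `numpy.matrix` always stay two-dimensional, so a per-column count comes back with shape `(1, n_vars)` and `len()` equal to 1. The snippet below assumes that `filter_genes` computes its per-gene count with an axis-0 reduction along these lines; that is an assumption about scanpy's internals, and only NumPy is used here.

```python
import numpy as np

X_matrix = np.matrix([[1, 2], [3, 0]])
X_array = np.asarray(X_matrix)

# Per-column count of nonzero entries, analogous to "cells per gene".
# (Assumed to resemble what filter_genes does; not taken from its source.)
n_cells_matrix = np.sum(X_matrix > 0, axis=0)
n_cells_array = np.sum(X_array > 0, axis=0)

# The matrix result keeps two dimensions, so its length is 1
print(n_cells_matrix.shape, len(n_cells_matrix))  # (1, 2) 1

# The ndarray result is one-dimensional, with one entry per column
print(n_cells_array.shape, len(n_cells_array))    # (2,) 2

# Flattening the matrix result recovers a 1-D array of the expected length
flat = np.asarray(n_cells_matrix).ravel()
print(flat.shape)                                 # (2,)
```

A value of length 1 assigned to a `var` DataFrame with 2 rows would produce exactly the message above, which is consistent with the traceback.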
Versions
-----
anndata 0.10.2
scanpy 1.9.6
-----
PIL 9.4.0
appnope 0.1.3
asttokens NA
backcall 0.2.0
bottleneck 1.3.5
cffi 1.15.1
comm 0.1.3
cycler 0.10.0
cython_runtime NA
dateutil 2.8.2
debugpy 1.6.7
decorator 5.1.1
exceptiongroup 1.1.3
executing 1.2.0
google NA
h5py 3.9.0
ipykernel 6.23.1
jedi 0.18.2
joblib 1.3.2
kiwisolver 1.4.4
llvmlite 0.41.1
matplotlib 3.7.2
mkl 2.4.0
mpl_toolkits NA
natsort 8.4.0
numba 0.58.1
numexpr 2.8.4
numpy 1.25.2
packaging 23.1
pandas 2.0.3
parso 0.8.3
pexpect 4.8.0
pickleshare 0.7.5
pkg_resources NA
platformdirs 3.5.1
prompt_toolkit 3.0.38
psutil 5.9.0
ptyprocess 0.7.0
pure_eval 0.2.2
pydev_ipython NA
pydevconsole NA
pydevd 2.9.5
pydevd_file_utils NA
pydevd_plugins NA
pydevd_tracing NA
pygments 2.15.1
pyparsing 3.0.9
pytz 2022.7
scipy 1.11.1
session_info 1.0.0
six 1.16.0
sklearn 1.3.0
stack_data 0.6.2
threadpoolctl 3.2.0
tornado 6.3.2
traitlets 5.9.0
typing_extensions NA
wcwidth 0.2.6
yaml 6.0.1
zmq 25.1.0
zoneinfo NA
-----
IPython 8.14.0
jupyter_client 8.2.0
jupyter_core 5.3.0
-----
Python 3.10.10 (main, Mar 21 2023, 13:41:39) [Clang 14.0.6 ]
macOS-10.16-x86_64-i386-64bit
-----
Session information updated at 2024-01-24 17:43