scirpy
scirpy.tl.chain_qc() -- KeyError: 'has_ir'
Description of the bug
Hi! I am a first-time user and am having issues getting started with a Scirpy workflow on my dataset.
I prepared and preprocessed my AnnData object as described in the Scirpy tutorial, and also ran scirpy.io.upgrade_schema().
However, when I run scirpy.tl.chain_qc(), I get a long error that appears to indicate that the columns the function expects (e.g. 'has_ir') do not exist in my AnnData.
Minimal reproducible example
import scanpy as sc
import scirpy as ir
t = sc.read_h5ad('t.h5ad')
ir.io.upgrade_schema(t)
sc.pp.filter_genes(t, min_cells=10)
sc.pp.filter_cells(t, min_genes=100)
sc.pp.normalize_per_cell(t, counts_per_cell_after=1000)
sc.pp.log1p(t)
sc.pp.highly_variable_genes(t, flavor="cell_ranger", n_top_genes=5000)
sc.tl.pca(t)
sc.pp.neighbors(t)
sc.tl.leiden(t)
sc.tl.umap(t)
ir.tl.chain_qc(t)
The error message produced by the code above
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
File /broad/hahnlab_ce/brianshim/scvi_env/lib/python3.9/site-packages/pandas/core/indexes/base.py:3621, in Index.get_loc(self, key, method, tolerance)
3620 try:
-> 3621 return self._engine.get_loc(casted_key)
3622 except KeyError as err:
File /broad/hahnlab_ce/brianshim/scvi_env/lib/python3.9/site-packages/pandas/_libs/index.pyx:136, in pandas._libs.index.IndexEngine.get_loc()
File /broad/hahnlab_ce/brianshim/scvi_env/lib/python3.9/site-packages/pandas/_libs/index.pyx:163, in pandas._libs.index.IndexEngine.get_loc()
File pandas/_libs/hashtable_class_helper.pxi:5198, in pandas._libs.hashtable.PyObjectHashTable.get_item()
File pandas/_libs/hashtable_class_helper.pxi:5206, in pandas._libs.hashtable.PyObjectHashTable.get_item()
KeyError: 'has_ir'
The above exception was the direct cause of the following exception:
KeyError Traceback (most recent call last)
Input In [70], in <cell line: 1>()
----> 1 ir.tl.chain_qc(t)
File /broad/hahnlab_ce/brianshim/scvi_env/lib/python3.9/site-packages/scirpy/io/_util.py:67, in _check_upgrade_schema.<locals>.check_upgrade_schema_decorator.<locals>.check_wrapper(*args, **kwargs)
65 for i in check_args:
66 _check_anndata_upgrade_schema(args[i])
---> 67 return f(*args, **kwargs)
File /broad/hahnlab_ce/brianshim/scvi_env/lib/python3.9/site-packages/scirpy/tl/_chain_qc.py:109, in chain_qc(adata, inplace, key_added)
106 res_receptor_type = np.empty(dtype=f"<U{string_length}", shape=(x.shape[0],))
107 res_receptor_subtype = np.empty(dtype=f"<U{string_length}", shape=(x.shape[0],))
--> 109 mask_has_ir = _is_true(x["has_ir"].values)
110 mask_multichain = mask_has_ir & _is_true(x["multi_chain"].values)
112 vj_loci = x.loc[:, ["IR_VJ_1_locus", "IR_VJ_2_locus"]].values
File /broad/hahnlab_ce/brianshim/scvi_env/lib/python3.9/site-packages/pandas/core/frame.py:3505, in DataFrame.__getitem__(self, key)
3503 if self.columns.nlevels > 1:
3504 return self._getitem_multilevel(key)
-> 3505 indexer = self.columns.get_loc(key)
3506 if is_integer(indexer):
3507 indexer = [indexer]
File /broad/hahnlab_ce/brianshim/scvi_env/lib/python3.9/site-packages/pandas/core/indexes/base.py:3623, in Index.get_loc(self, key, method, tolerance)
3621 return self._engine.get_loc(casted_key)
3622 except KeyError as err:
-> 3623 raise KeyError(key) from err
3624 except TypeError:
3625 # If we have a listlike key, _check_indexing_error will raise
3626 # InvalidIndexError. Otherwise we fall through and re-raise
3627 # the TypeError.
3628 self._check_indexing_error(key)
KeyError: 'has_ir'
Version information
versions
-----
anndata 0.8.0
scanpy 1.9.1
-----
Levenshtein NA
PIL 9.1.1
adjustText NA
airr 1.3.1
asttokens NA
backcall 0.2.0
beta_ufunc NA
binom_ufunc NA
cffi 1.15.0
colorama 0.4.4
cycler 0.10.0
cython_runtime NA
dateutil 2.8.2
debugpy 1.6.0
decorator 5.1.1
defusedxml 0.7.1
entrypoints 0.4
executing 0.8.3
google NA
h5py 3.2.1
igraph 0.9.10
ipykernel 6.13.0
ipython_genutils 0.2.0
ipywidgets 7.7.0
jedi 0.18.1
joblib 1.1.0
jupyter_server 1.17.0
kiwisolver 1.4.2
leidenalg 0.8.10
llvmlite 0.38.0
matplotlib 3.5.2
matplotlib_inline NA
mpl_toolkits NA
natsort 8.1.0
nbinom_ufunc NA
networkx 2.8.1
numba 0.55.1
numpy 1.21.6
packaging 21.3
pandas 1.4.2
parasail 1.2.4
parso 0.8.3
pexpect 4.8.0
pickleshare 0.7.5
pkg_resources NA
prompt_toolkit 3.0.29
psutil 5.9.0
ptyprocess 0.7.0
pure_eval 0.2.2
pycparser 2.21
pydev_ipython NA
pydevconsole NA
pydevd 2.8.0
pydevd_file_utils NA
pydevd_plugins NA
pydevd_tracing NA
pygments 2.12.0
pynndescent 0.5.7
pyparsing 3.0.9
pytoml NA
pytz 2022.1
scipy 1.7.3
scirpy 0.10.1
seaborn 0.11.2
session_info 1.0.0
setuptools_scm NA
six 1.16.0
sklearn 1.0.2
sphinxcontrib NA
stack_data 0.2.0
statsmodels 0.13.2
texttable 1.6.4
threadpoolctl 3.1.0
tornado 6.1
tqdm 4.64.0
tracerlib NA
traitlets 5.2.1.post0
typing_extensions NA
umap 0.5.3
wcwidth 0.2.5
yaml 6.0
yamlordereddictloader NA
zmq 23.0.0
-----
IPython 8.3.0
jupyter_client 7.3.1
jupyter_core 4.10.0
jupyterlab 3.4.2
notebook 6.4.11
-----
Python 3.9.12 | packaged by conda-forge | (main, Mar 24 2022, 23:25:59) [GCC 10.3.0]
Linux-3.10.0-1160.59.1.el7.x86_64-x86_64-with-glibc2.17
-----
Hi @brian-shim,
thanks for reporting this.
This could be an issue in upgrade_schema.
To confirm, could you please report the output of
t.obs.columns
before and after running upgrade_schema?
Cheers, Gregor
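One possible way to produce the requested before/after comparison is sketched below. The helper name diff_columns is hypothetical and not part of scirpy. The column names in the demo are toy stand-ins borrowed from the traceback, not real output from the reporter's data. With the actual object, the two inputs would be list(t.obs.columns) captured before and after calling ir.io.upgrade_schema(t).

```python
def diff_columns(before, after):
    """Return (added, removed) column names as two sorted lists."""
    before_set, after_set = set(before), set(after)
    added = sorted(after_set - before_set)
    removed = sorted(before_set - after_set)
    return added, removed


# Intended use with the AnnData object from the report (not executed here):
#   cols_before = list(t.obs.columns)
#   ir.io.upgrade_schema(t)
#   cols_after = list(t.obs.columns)
#   added, removed = diff_columns(cols_before, cols_after)

# Toy stand-in column lists, purely illustrative (an assumption, not real data).
toy_before = ["n_genes"]
toy_after = ["n_genes", "has_ir", "multi_chain"]

added, removed = diff_columns(toy_before, toy_after)
print("added by upgrade:", added)
print("removed by upgrade:", removed)
```

If 'has_ir' appears in neither list for the real object, that would suggest the column was never created, which is consistent with the KeyError above.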
As we are moving towards the new scirpy data structure (currently available as release candidate v0.13.0rc1), this error is not applicable anymore.
I'm therefore closing this issue.