scirpy.tl.chain_qc() -- KeyError: 'has_ir'

Open brian-shim opened this issue 3 years ago • 1 comments

Description of the bug

Hi! I am a first-time user and am having issues getting started with a Scirpy workflow on my dataset.

I prepare and preprocess my AnnData object in the same way as described in the Scirpy tutorial, and also run scirpy.io.upgrade_schema().

However, when I attempt to run scirpy.tl.chain_qc(), I get a long error which appears to indicate that the keys the function expects (such as 'has_ir') do not exist in my AnnData object.

Minimal reproducible example

import scanpy as sc  # needed for the sc.* calls below
import scirpy as ir

t = sc.read_h5ad('t.h5ad')
ir.io.upgrade_schema(t)

sc.pp.filter_genes(t, min_cells=10)
sc.pp.filter_cells(t, min_genes=100)
sc.pp.normalize_per_cell(t, counts_per_cell_after=1000)
sc.pp.log1p(t)
sc.pp.highly_variable_genes(t, flavor="cell_ranger", n_top_genes=5000)
sc.tl.pca(t)
sc.pp.neighbors(t)
sc.tl.leiden(t)
sc.tl.umap(t)

ir.tl.chain_qc(t)

The error message produced by the code above

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
File /broad/hahnlab_ce/brianshim/scvi_env/lib/python3.9/site-packages/pandas/core/indexes/base.py:3621, in Index.get_loc(self, key, method, tolerance)
   3620 try:
-> 3621     return self._engine.get_loc(casted_key)
   3622 except KeyError as err:

File /broad/hahnlab_ce/brianshim/scvi_env/lib/python3.9/site-packages/pandas/_libs/index.pyx:136, in pandas._libs.index.IndexEngine.get_loc()

File /broad/hahnlab_ce/brianshim/scvi_env/lib/python3.9/site-packages/pandas/_libs/index.pyx:163, in pandas._libs.index.IndexEngine.get_loc()

File pandas/_libs/hashtable_class_helper.pxi:5198, in pandas._libs.hashtable.PyObjectHashTable.get_item()

File pandas/_libs/hashtable_class_helper.pxi:5206, in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'has_ir'

The above exception was the direct cause of the following exception:

KeyError                                  Traceback (most recent call last)
Input In [70], in <cell line: 1>()
----> 1 ir.tl.chain_qc(t)

File /broad/hahnlab_ce/brianshim/scvi_env/lib/python3.9/site-packages/scirpy/io/_util.py:67, in _check_upgrade_schema.<locals>.check_upgrade_schema_decorator.<locals>.check_wrapper(*args, **kwargs)
     65 for i in check_args:
     66     _check_anndata_upgrade_schema(args[i])
---> 67 return f(*args, **kwargs)

File /broad/hahnlab_ce/brianshim/scvi_env/lib/python3.9/site-packages/scirpy/tl/_chain_qc.py:109, in chain_qc(adata, inplace, key_added)
    106 res_receptor_type = np.empty(dtype=f"<U{string_length}", shape=(x.shape[0],))
    107 res_receptor_subtype = np.empty(dtype=f"<U{string_length}", shape=(x.shape[0],))
--> 109 mask_has_ir = _is_true(x["has_ir"].values)
    110 mask_multichain = mask_has_ir & _is_true(x["multi_chain"].values)
    112 vj_loci = x.loc[:, ["IR_VJ_1_locus", "IR_VJ_2_locus"]].values

File /broad/hahnlab_ce/brianshim/scvi_env/lib/python3.9/site-packages/pandas/core/frame.py:3505, in DataFrame.__getitem__(self, key)
   3503 if self.columns.nlevels > 1:
   3504     return self._getitem_multilevel(key)
-> 3505 indexer = self.columns.get_loc(key)
   3506 if is_integer(indexer):
   3507     indexer = [indexer]

File /broad/hahnlab_ce/brianshim/scvi_env/lib/python3.9/site-packages/pandas/core/indexes/base.py:3623, in Index.get_loc(self, key, method, tolerance)
   3621     return self._engine.get_loc(casted_key)
   3622 except KeyError as err:
-> 3623     raise KeyError(key) from err
   3624 except TypeError:
   3625     # If we have a listlike key, _check_indexing_error will raise
   3626     #  InvalidIndexError. Otherwise we fall through and re-raise
   3627     #  the TypeError.
   3628     self._check_indexing_error(key)

KeyError: 'has_ir'

Version information

versions
-----
anndata     0.8.0
scanpy      1.9.1
-----
Levenshtein                 NA
PIL                         9.1.1
adjustText                  NA
airr                        1.3.1
asttokens                   NA
backcall                    0.2.0
beta_ufunc                  NA
binom_ufunc                 NA
cffi                        1.15.0
colorama                    0.4.4
cycler                      0.10.0
cython_runtime              NA
dateutil                    2.8.2
debugpy                     1.6.0
decorator                   5.1.1
defusedxml                  0.7.1
entrypoints                 0.4
executing                   0.8.3
google                      NA
h5py                        3.2.1
igraph                      0.9.10
ipykernel                   6.13.0
ipython_genutils            0.2.0
ipywidgets                  7.7.0
jedi                        0.18.1
joblib                      1.1.0
jupyter_server              1.17.0
kiwisolver                  1.4.2
leidenalg                   0.8.10
llvmlite                    0.38.0
matplotlib                  3.5.2
matplotlib_inline           NA
mpl_toolkits                NA
natsort                     8.1.0
nbinom_ufunc                NA
networkx                    2.8.1
numba                       0.55.1
numpy                       1.21.6
packaging                   21.3
pandas                      1.4.2
parasail                    1.2.4
parso                       0.8.3
pexpect                     4.8.0
pickleshare                 0.7.5
pkg_resources               NA
prompt_toolkit              3.0.29
psutil                      5.9.0
ptyprocess                  0.7.0
pure_eval                   0.2.2
pycparser                   2.21
pydev_ipython               NA
pydevconsole                NA
pydevd                      2.8.0
pydevd_file_utils           NA
pydevd_plugins              NA
pydevd_tracing              NA
pygments                    2.12.0
pynndescent                 0.5.7
pyparsing                   3.0.9
pytoml                      NA
pytz                        2022.1
scipy                       1.7.3
scirpy                      0.10.1
seaborn                     0.11.2
session_info                1.0.0
setuptools_scm              NA
six                         1.16.0
sklearn                     1.0.2
sphinxcontrib               NA
stack_data                  0.2.0
statsmodels                 0.13.2
texttable                   1.6.4
threadpoolctl               3.1.0
tornado                     6.1
tqdm                        4.64.0
tracerlib                   NA
traitlets                   5.2.1.post0
typing_extensions           NA
umap                        0.5.3
wcwidth                     0.2.5
yaml                        6.0
yamlordereddictloader       NA
zmq                         23.0.0
-----
IPython             8.3.0
jupyter_client      7.3.1
jupyter_core        4.10.0
jupyterlab          3.4.2
notebook            6.4.11
-----
Python 3.9.12 | packaged by conda-forge | (main, Mar 24 2022, 23:25:59) [GCC 10.3.0]
Linux-3.10.0-1160.59.1.el7.x86_64-x86_64-with-glibc2.17
-----

brian-shim, Jun 06 '22 20:06

Hi @brian-shim,

thanks for reporting this. This could be an issue in upgrade_schema.

To confirm, could you please report the output of

t.obs.columns

before and after running upgrade_schema?
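One way to make that comparison easy to read is sketched below. It is only an illustration, not part of scirpy: `diff_columns` and `missing_columns` are hypothetical helper names, and the list of required columns is taken from the lines of `chain_qc` visible in the traceback above (`has_ir`, `multi_chain`, `IR_VJ_1_locus`, `IR_VJ_2_locus`). The toy column names at the bottom are made up for the demonstration.

```python
# Hypothetical helpers for comparing obs column names; not part of scirpy.

# Columns that chain_qc accesses according to the traceback in this issue.
REQUIRED = ("has_ir", "multi_chain", "IR_VJ_1_locus", "IR_VJ_2_locus")


def diff_columns(before, after):
    """Return (added, removed) column names, each as a sorted list."""
    before, after = set(before), set(after)
    return sorted(after - before), sorted(before - after)


def missing_columns(columns, required=REQUIRED):
    """Return the required column names that are absent from `columns`."""
    present = set(columns)
    return [name for name in required if name not in present]


# Intended use with the object from the report (not run here, since it
# needs scanpy, scirpy and the reporter's file):
#   before = list(t.obs.columns)
#   ir.io.upgrade_schema(t)
#   after = list(t.obs.columns)
#   print(diff_columns(before, after))
#   print(missing_columns(after))

# Toy column names, invented purely to show the helpers running.
example_before = ["n_genes", "leiden"]
example_after = ["n_genes", "leiden", "has_ir"]
print(diff_columns(example_before, example_after))
print(missing_columns(example_after))
```

If `missing_columns` still lists `has_ir` after the upgrade, the columns were probably never present in the file, rather than being lost during preprocessing.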

Cheers, Gregor

grst, Jun 07 '22 14:06

As we are moving towards the new scirpy data structure (currently available as release candidate v0.13.0rc1), this error is no longer applicable.

I'm therefore closing this issue.

grst, Apr 07 '23 17:04