Could not find keys in columns of adata.obs or in data.var_names

Open stephwon opened this issue 2 years ago • 1 comments

  • [x] I have checked that this issue has not already been reported.
  • [x] I have confirmed this bug exists on the latest version of scanpy.
  • [ ] (optional) I have confirmed this bug exists on the master branch of scanpy.

I am trying to run through the Pearson residuals workflow, but the 'Plot quality control metrics' step fails with an error saying it could not find the keys in adata.obs or in adata.var_names.

Minimal code sample (that we can copy&paste without having any data)

# adata_control = sc.read_csv('/Users/csb/Desktop/SevenBridge_custom reference remapping_2022.7.21/Sample_sample_Control_WTA/Control_new.csv')
adata_control.uns["name"] = "zfish_Control"

for adata in adata_control:
    adata.var_names_make_unique()
    print(adata.uns["name"], ": data shape", adata.shape)
    sc.pp.filter_cells(adata, min_genes=100)
    sc.pp.filter_genes(adata, min_cells=100)

for adata in adata_control:
    adata.var['mt'] = adata.var_names.str.startswith('mt-')
    sc.pp.calculate_qc_metrics(
        adata, qc_vars=['mt'], percent_top=None, log1p=False, inplace=True
    )

for adata in adata_control:
    print(adata.uns["name"], ":")
    sc.pl.violin(
        adata,
        ["n_genes_by_counts", "total_counts", "pct_counts_mt"],
        jitter=0.4,
        multi_panel=True,
    )
zfish_Control :
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
/var/folders/bh/yzgycbkj1jn1bd9p52t5s8s40000gn/T/ipykernel_17867/842143982.py in <module>
      1 for adata in adata_control:
      2     print(adata.uns["name"], ":")
----> 3     sc.pl.violin(
      4     adata,
      5     ["n_genes_by_counts", "total_counts", "pct_counts_mt"],

~/opt/anaconda3/lib/python3.9/site-packages/scanpy/plotting/_anndata.py in violin(adata, keys, groupby, log, use_raw, stripplot, jitter, size, layer, scale, order, multi_panel, xlabel, ylabel, rotation, show, save, ax, **kwds)
    779             )
    780     else:
--> 781         obs_df = get.obs_df(adata, keys=keys, layer=layer, use_raw=use_raw)
    782     if groupby is None:
    783         obs_tidy = pd.melt(obs_df, value_vars=keys)

~/opt/anaconda3/lib/python3.9/site-packages/scanpy/get/get.py in obs_df(adata, keys, obsm_keys, layer, gene_symbols, use_raw)
    270         alias_index = None
    271 
--> 272     obs_cols, var_idx_keys, var_symbols = _check_indices(
    273         adata.obs,
    274         var.index,

~/opt/anaconda3/lib/python3.9/site-packages/scanpy/get/get.py in _check_indices(dim_df, alt_index, dim, keys, alias_index, use_raw)
    165             not_found.append(key)
    166     if len(not_found) > 0:
--> 167         raise KeyError(
    168             f"Could not find keys '{not_found}' in columns of `adata.{dim}` or in"
    169             f" {alt_repr}.{alt_search_repr}."

KeyError: "Could not find keys '['n_genes_by_counts', 'pct_counts_mt', 'total_counts']' in columns of `adata.obs` or in adata.var_names."
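
The traceback shows the lookup failing because none of the three QC keys is a column of adata.obs or an entry of adata.var_names. The check can be reproduced before plotting with a small helper. This is a simplified sketch, not scanpy's actual `_check_indices`; the name `find_missing_keys` and the example values are hypothetical.

```python
def find_missing_keys(keys, obs_columns, var_names):
    """Return the keys found neither among the obs columns nor the variable names."""
    obs_columns = set(obs_columns)
    var_names = set(var_names)
    return [key for key in keys if key not in obs_columns and key not in var_names]


qc_keys = ["n_genes_by_counts", "total_counts", "pct_counts_mt"]

# Hypothetical state matching the error: no QC columns were stored on the object.
obs_columns_before = []
var_names = ["ABCA7", "ABCD2", "ABR"]
missing_before = find_missing_keys(qc_keys, obs_columns_before, var_names)

# Hypothetical state after the QC metrics were successfully written to obs.
obs_columns_after = ["n_genes_by_counts", "total_counts", "pct_counts_mt"]
missing_after = find_missing_keys(qc_keys, obs_columns_after, var_names)

print("missing before:", missing_before)
print("missing after:", missing_after)
```

With a real object, the same helper could be called as `find_missing_keys(qc_keys, adata.obs.columns, adata.var_names)` right before `sc.pl.violin`, to see whether `sc.pp.calculate_qc_metrics` actually left its columns on the object being plotted.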

Versions

anndata 0.8.0 scanpy 1.9.1

PIL 8.4.0 anyio NA appnope 0.1.2 attr 21.2.0 babel 2.9.1 backcall 0.2.0 beta_ufunc NA binom_ufunc NA bottleneck 1.3.2 brotli NA certifi 2022.06.15 cffi 1.14.6 chardet 4.0.0 charset_normalizer 2.0.4 cloudpickle 2.0.0 colorama 0.4.4 cycler 0.10.0 cython_runtime NA cytoolz 0.11.0 dask 2021.10.0 dateutil 2.8.2 debugpy 1.4.1 decorator 5.1.0 defusedxml 0.7.1 entrypoints 0.3 fastjsonschema NA fsspec 2021.08.1 google NA h5py 3.2.1 idna 3.2 igraph 0.9.11 ipykernel 6.4.1 ipython_genutils 0.2.0 ipywidgets 7.6.5 jedi 0.18.0 jinja2 2.11.3 joblib 1.1.0 json5 NA jsonschema 3.2.0 jupyter_server 1.4.1 jupyterlab_server 2.8.2 kiwisolver 1.3.1 leidenalg 0.8.10 llvmlite 0.37.0 louvain 0.7.1 markupsafe 1.1.1 matplotlib 3.4.3 matplotlib_inline NA mkl 2.4.0 mpl_toolkits NA natsort 8.1.0 nbclassic NA nbformat 5.1.3 nbinom_ufunc NA numba 0.54.1 numexpr 2.7.3 numpy 1.20.3 packaging 21.0 pandas 1.4.2 parso 0.8.2 pexpect 4.8.0 pickleshare 0.7.5 pkg_resources NA prometheus_client NA prompt_toolkit 3.0.20 psutil 5.8.0 ptyprocess 0.7.0 pvectorc NA pycparser 2.20 pydev_ipython NA pydevconsole NA pydevd 2.4.1 pydevd_concurrency_analyser NA pydevd_file_utils NA pydevd_plugins NA pydevd_tracing NA pyexpat NA pygments 2.10.0 pynndescent 0.5.7 pyparsing 3.0.4 pyrsistent NA pytz 2021.3 requests 2.26.0 scipy 1.7.1 seaborn 0.11.2 send2trash NA session_info 1.0.0 six 1.16.0 sklearn 0.24.2 snappy NA sniffio 1.2.0 socks 1.7.1 sphinxcontrib NA statsmodels 0.12.2 storemagic NA tblib 1.7.0 texttable 1.6.4 tlz 0.11.0 toolz 0.11.1 tornado 6.1 tqdm 4.62.3 traitlets 5.1.0 typing_extensions NA umap 0.5.3 urllib3 1.26.7 wcwidth 0.2.5 yaml 6.0 zmq 22.2.1 zope NA

IPython 7.29.0 jupyter_client 6.1.12 jupyter_core 4.8.1 jupyterlab 3.2.1 notebook 6.4.5

Python 3.9.7 (default, Sep 16 2021, 08:50:36) [Clang 10.0.0 ]

Some additional info: the raw data does not have cell barcodes, but it is UMI count data.
print(adata.obs_names)
print()
print(adata.var_names)
Index(['0'], dtype='object')

Index(['', 'ABCA7', 'ABCD2', 'ABR', 'ACADSB', 'ACAP2', 'ACBD3', 'ACKR2',
       'ACOT12', 'ACSF3',
       ...
       'si:dkey-117m1.4', 'si:dkey-118j18.2', 'si:dkey-118k5.3',
       'si:dkey-119f1.1', 'si:dkey-119m7.8', 'si:dkey-11c5.11',
       'si:dkey-11d20.1', 'si:dkey-11e23.9', 'si:dkey-11f4.16',
       'si:dkey-11f4.20'],
      dtype='object', length=16384)
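
A possible cause, not confirmed anywhere in this thread: `adata_control` is a single object rather than a list of objects, so `for adata in adata_control` may fall back on Python's index-based iteration and hand out a fresh one-row view on every pass. Columns written to such a view in one loop would then be gone in the next loop. The sketch below only illustrates that mechanism with hypothetical stand-in classes (`Table`, `RowView`); it does not use anndata, and whether AnnData behaves this way in the reported versions is an assumption.

```python
class RowView:
    """Hypothetical stand-in for a one-row view; it carries its own annotations."""

    def __init__(self, index):
        self.index = index
        self.columns = {}


class Table:
    """Hypothetical container that defines __len__ and __getitem__ but no __iter__."""

    def __init__(self, n_rows):
        self.n_rows = n_rows
        self.columns = {}

    def __len__(self):
        return self.n_rows

    def __getitem__(self, index):
        if not 0 <= index < self.n_rows:
            raise IndexError(index)  # ends the index-based iteration
        return RowView(index)  # a new object on every access


table = Table(n_rows=1)

# First loop: annotate whatever the iteration yields (a throwaway view).
for row in table:
    row.columns["total_counts"] = 42.0

# Second loop: the views are created again, so the annotation is not there.
seen_in_second_loop = ["total_counts" in row.columns for row in table]

# The container itself never received the column either.
in_parent = "total_counts" in table.columns

# Contrast: looping over a list that holds the container modifies the container.
for item in [table]:
    item.columns["total_counts"] = 42.0
kept_with_list = "total_counts" in table.columns

print(seen_in_second_loop, in_parent, kept_with_list)
```

If this is what happens here, looping over `[adata_control]` instead of `adata_control`, or calling the preprocessing and plotting functions on `adata_control` directly, would keep the QC columns on the object that is later plotted.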

stephwon Aug 02 '22 06:08

Hi @pacificoceanmist, which scanpy version are you using, and could you update to the latest version?

giovp Aug 10 '22 11:08

We will close the issue for now, hopefully you obtained the expected behaviour :)

However, please don't hesitate to reopen this issue or create a new one if you have any more questions or run into any related problems in the future.

Thanks for being a part of our community! :)

eroell Oct 12 '23 09:10