labels_key mismatch in scgen.SCGEN.setup_anndata()

Open johnjeang opened this issue 2 years ago • 0 comments

Hello, I am trying to run batch correction on some single-cell RNA-seq data. The data has cell type labels, so I thought the scGen method would be a good fit.

For reference, I am following the Google Colab tutorial here: https://colab.research.google.com/github/theislab/scgen/blob/master/docs/tutorials/scgen_batch_removal.ipynb#scrollTo=OMMhgkQlpb8s

When trying to run

scgen.SCGEN.setup_anndata(train, batch_key="source", labels_key="cell_type")

I get the following error about my labels, which indicates some kind of mismatch. The label names are identical in both lists, so it looks like some data structure issue is causing the mismatch?

ValueError: Making .obs["cell_type"] categorical failed. Expected categories: ['astro' 'endothelial' 'microglia' 'neuron' 'oligo' 'opc' 'tcell' 'unknown']. Received categories: Index(['astro', 'endothelial', 'microglia', 'neuron', 'oligo', 'opc', 'tcell', 'unknown'], dtype='object').
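
Since the expected and received category names match, one thing that may be worth checking is whether the label column contains missing values: pandas assigns the category code -1 to NaN entries, and NaN never shows up in the category list itself. The sketch below uses only pandas and a small made-up DataFrame (`obs`) as a stand-in for `train.obs`; the toy values and the choice to fill gaps with "unknown" are assumptions for illustration, not a confirmed cause or fix for this error.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for train.obs; the real column lives on the AnnData object.
obs = pd.DataFrame(
    {"cell_type": ["astro", "neuron", np.nan, "oligo", "neuron"]}
)

col = obs["cell_type"]

# Inspect the dtype and count missing labels.
print(col.dtype)
n_missing = int(col.isna().sum())
print(f"missing labels: {n_missing}")

# NaN entries get code -1 when the column is converted to categorical,
# even though the list of category names looks complete.
codes = col.astype("category").cat.codes
has_missing_code = bool((codes == -1).any())
print(f"any code == -1: {has_missing_code}")

# One possible cleanup (an assumption, not a verified fix): give missing
# entries an explicit label and store the column as categorical.
cleaned = col.fillna("unknown").astype("category")
print(list(cleaned.cat.categories))
```

If the real column does have missing entries, the same cleanup could be assigned back to `train.obs["cell_type"]` before calling `setup_anndata` again.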

I am working with Python 3.8.3, scanpy 1.9.1, and scgen 2.1.0.

Any ideas on how to solve this issue?

johnjeang, May 06 '22 14:05