highly_variable_genes AssertionError: Don’t call _normalize_index with non-categorical/string names

Open r-panero opened this issue 5 years ago • 9 comments

Hi, I am using anndata 0.6.21 and scanpy 1.4.3. I executed this code:

sc.pp.highly_variable_genes(adata, min_mean=0.0001, max_mean=3, min_disp=0.5)

sc.pl.highly_variable_genes(adata)

adata = adata[:, adata.var['highly_variable']]

and I got this error: AssertionError: Don’t call _normalize_index with non-categorical/string names. Can you help me?

Thank you.

r-panero avatar Jul 25 '19 12:07 r-panero

Hi. I just tried running that, and wasn't able to reproduce that error. Here's what I ran:

import scanpy as sc

adata = sc.datasets.pbmc3k()
sc.pp.filter_genes(adata, min_counts=1)
sc.pp.log1p(adata)
sc.pp.highly_variable_genes(adata, min_mean=0.0001, max_mean=3, min_disp=0.5)
sc.pl.highly_variable_genes(adata)
adata = adata[:, adata.var['highly_variable']]

Could you update to the latest releases (scanpy 1.4.4, anndata 0.6.22) and try that?

ivirshup avatar Jul 29 '19 04:07 ivirshup

Hi, I converted the obs and values to strings and it worked. Thanks.

ghost avatar Jul 29 '19 18:07 ghost

That'll do it 😄

Do you know how you ended up with non-string indices? Ideally, we would be able to prevent that from happening or at least warn the user about it.

ivirshup avatar Jul 30 '19 04:07 ivirshup

I created the adata without using the functions scanpy provides for loading single-cell data. This kind of conversion is done in those functions, right?

ghost avatar Jul 30 '19 05:07 ghost

I had the same problem when I loaded sample data from a CSV as a data frame and assigned it with adata.obs = df.
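
A minimal sketch of how a CSV-loaded data frame can end up with a non-string index, using pandas only. The CSV content and column names below are made up for illustration, and the final assignment to adata.obs is left out, so this only shows the index check and cast that would come before it.

```python
import io

import pandas as pd

# Hypothetical sample sheet standing in for the CSV mentioned above.
csv_text = "sample,condition\n101,ctrl\n102,treated\n103,ctrl\n"

# Without index_col, pandas assigns a default integer RangeIndex.
df = pd.read_csv(io.StringIO(csv_text))
default_kind = df.index.inferred_type

# A numeric ID column used as the index is still an integer index.
df = df.set_index("sample")
numeric_kind = df.index.inferred_type

# Cast the labels to strings before handing the frame to adata.obs.
df.index = df.index.astype(str)
final_kind = df.index.inferred_type

print(default_kind, numeric_kind, final_kind)
```

Checking df.index.inferred_type before the assignment is one cheap way to catch the problem early; any result other than "string" suggests a cast is needed.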

maximilianh avatar Jul 30 '19 06:07 maximilianh

I believe the conversion happens whenever the object is created, so if you ran something like:

adata = AnnData(X, ...)

that should always have string obs and var names. I believe we throw a warning if you try to set .obs and .var directly with non-string indices.

ivirshup avatar Jul 30 '19 12:07 ivirshup

> I had the same problem when I loaded sample data from a csv as a data frame and assigned it to adata.obs = df

I ran into the same problem when I tried to replace adata.obs with another pandas DataFrame.

xiachenrui avatar Jan 22 '22 12:01 xiachenrui

The solution here is adata.obs.index = adata.obs.index.astype(str). This should be done by default if this AssertionError is raised.
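
The same cast can be wrapped so that it covers both obs and var, since the thread reports the error for either. This is only a sketch: stringify_names is a hypothetical helper, not part of scanpy or anndata, and the SimpleNamespace object is a stand-in that merely assumes an AnnData-like object exposing .obs and .var as pandas data frames.

```python
from types import SimpleNamespace

import pandas as pd


def stringify_names(adata):
    """Cast obs and var index labels to str in place (hypothetical helper)."""
    adata.obs.index = adata.obs.index.astype(str)
    adata.var.index = adata.var.index.astype(str)
    return adata


# Stand-in for an AnnData object: only .obs and .var are modelled here.
mock = SimpleNamespace(
    obs=pd.DataFrame({"batch": ["a", "b"]}, index=[0, 1]),
    var=pd.DataFrame({"symbol": ["g1", "g2", "g3"]}, index=[10, 11, 12]),
)

stringify_names(mock)
print(list(mock.obs.index), list(mock.var.index))
```

With a real object, the call would go before subsetting, for example before adata[:, adata.var['highly_variable']].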

scottgigante-immunai avatar Sep 09 '22 16:09 scottgigante-immunai

This just happened to me, but only when the matrix was in .mtx format. I fixed it with adata.var.index = adata.var.index.astype(str). The Scanpy version was 1.4.5.1.

maximilianh avatar Sep 22 '22 12:09 maximilianh

Thanks everyone for the discussion here!

Will close the issue for now, as based on the provided information and the discussion so far, it seems that the issues have been addressed and hopefully resolved :)

However, please don't hesitate to reopen this issue or create a new one if you have any more questions or run into any related problems in the future.

Thanks for being a part of our community! :)

eroell avatar Oct 12 '23 09:10 eroell