squidpy

neighborhood enrichment "nan" encountered in zscore result

Open frankligy opened this issue 2 years ago • 3 comments

When running the sq.gr.nhood_enrichment function, I got the warning below:

100%|██████████| 1000/1000 [00:00<00:00, 2079.65/s]
/opt/anaconda3/envs/sc_env/lib/python3.7/site-packages/squidpy/gr/_nhood.py:182: RuntimeWarning: invalid value encountered in true_divide
  zscore = (count - perms.mean(axis=0)) / perms.std(axis=0)

Then when I looked at the resultant z-score matrix in adata.uns['cluster_nhood_enrichment']['zscore'], I found that a few entries are nan:

[Screenshot: z-score matrix containing nan entries]

As a direct result, the subsequent sq.pl.nhood_enrichment call fails:

Traceback (most recent call last):
  File "/opt/anaconda3/envs/sc_env/lib/python3.7/code.py", line 90, in runcode
    exec(code, self.locals)
  File "<input>", line 1, in <module>
  File "/opt/anaconda3/envs/sc_env/lib/python3.7/site-packages/squidpy/pl/_graph.py", line 232, in nhood_enrichment
    **kwargs,
  File "/opt/anaconda3/envs/sc_env/lib/python3.7/site-packages/squidpy/pl/_utils.py", line 549, in _heatmap
    row_order, col_order, _, col_link = _dendrogram(adata.X, method, optimal_ordering=adata.n_obs <= 1500)
  File "/opt/anaconda3/envs/sc_env/lib/python3.7/site-packages/squidpy/pl/_utils.py", line 618, in _dendrogram
    row_link = sch.linkage(data, method=method, **link_kwargs)
  File "/opt/anaconda3/envs/sc_env/lib/python3.7/site-packages/scipy/cluster/hierarchy.py", line 1065, in linkage
    raise ValueError("The condensed distance matrix must contain only "
ValueError: The condensed distance matrix must contain only finite values.

Do you have any recommendations on how to solve this issue?

Thanks a lot, Frank

frankligy avatar May 24 '22 15:05 frankligy

Hi @frankligy, based on the traceback, the error is raised when computing the dendrogram. Does the plot work without the dendrogram, or do you need it for your analysis? In principle, you could try running

import numpy as np
# np.nan_to_num replaces nan with 0.0 by default
score = adata.uns['cluster_nhood_enrichment']['zscore']
adata.uns['cluster_nhood_enrichment']['zscore'] = np.nan_to_num(score)

though I am not sure which values could be used for imputation without skewing the visualization.
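One way to make the imputation explicit is to record which entries were replaced, so they can be masked or annotated later. This is only a sketch under assumptions: `impute_zscore` is a hypothetical helper, not part of squidpy, and 0 is used as the fill value only because a z-score of 0 reads as "neither enriched nor depleted". Whether that is appropriate depends on the dataset.

```python
import numpy as np

def impute_zscore(zscore, fill=0.0):
    """Hypothetical helper: replace non-finite z-scores and report where."""
    zscore = np.asarray(zscore, dtype=float)
    replaced = ~np.isfinite(zscore)            # True where nan or +/-inf
    cleaned = np.where(replaced, fill, zscore)
    return cleaned, replaced

# Made-up matrix standing in for adata.uns['cluster_nhood_enrichment']['zscore']
z = np.array([[1.5, np.nan],
              [np.nan, -0.3]])
cleaned, replaced = impute_zscore(z)
```

With the object from this thread, the z-score array would be passed in and `cleaned` written back to the same key. The `replaced` mask makes it possible to tell imputed zeros apart from genuine ones.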

michalk8 avatar May 29 '22 22:05 michalk8

Hi @michalk8, thanks a lot for the reply! Yes, that's what I did for now: converting the "nan" entries to a valid value. I was wondering the same thing: which value should I use for imputation? Zero, the mean or median, or something else? I think "nan" is inevitable when computing the enrichment score this way, because the standard deviation across the permutations may be zero, so I wanted to bring it up and see if there are any ideas around it.
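The zero-variance case can be reproduced with plain NumPy. The sketch below uses made-up numbers and only the one formula quoted in the warning; `count` and `perms` mirror the names on that line, while `safe_zscore` is a hypothetical helper. Note that nan specifically comes from 0/0 (the observed count equals the permutation mean and the standard deviation is zero); a nonzero numerator over a zero standard deviation would give inf instead.

```python
import numpy as np

# Made-up data: observed neighbor counts for 2x2 cluster pairs and 4 permutations.
count = np.array([[5.0, 0.0],
                  [0.0, 7.0]])
perms = np.array([
    [[3.0, 0.0], [0.0, 6.0]],
    [[4.0, 0.0], [0.0, 8.0]],
    [[2.0, 0.0], [0.0, 5.0]],
    [[3.0, 0.0], [0.0, 9.0]],
])

def safe_zscore(count, perms):
    """Hypothetical helper: z-score that is nan only where std is zero, without warnings."""
    mean = perms.mean(axis=0)
    std = perms.std(axis=0)
    out = np.full(count.shape, np.nan)
    # Divide only where std is positive; other entries keep the nan placeholder.
    np.divide(count - mean, std, out=out, where=std > 0)
    return out

# Same formula as in the warning; warnings silenced only for this demonstration.
with np.errstate(invalid="ignore", divide="ignore"):
    naive = (count - perms.mean(axis=0)) / perms.std(axis=0)

zscore = safe_zscore(count, perms)
```

Here the off-diagonal pairs never co-occur in either the observed data or any permutation, so both versions yield nan there. The helper only makes the zero-variance handling explicit.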

frankligy avatar May 30 '22 03:05 frankligy

Hi @frankligy,

I think you could set them to 0. What's the cluster composition of your dataset (e.g. how many obs per cluster)?
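Cluster composition can be checked by counting observations per cluster label. A small sketch on made-up labels follows; with the AnnData object from this thread the labels would presumably come from `adata.obs['cluster']`, where the key name is inferred from `'cluster_nhood_enrichment'` and should be treated as an assumption. The size threshold is arbitrary.

```python
import numpy as np

# Made-up labels standing in for adata.obs["cluster"] (assumed key name).
labels = np.array(["A", "A", "B", "A", "C", "B", "A"])

# Count observations per cluster.
clusters, sizes = np.unique(labels, return_counts=True)
composition = dict(zip(clusters.tolist(), sizes.tolist()))

# Flag very small clusters (arbitrary threshold, for illustration only).
small = [c for c, n in composition.items() if n < 2]
```

Very small clusters are worth a look first, since pairs involving them may have no neighbors in any permutation.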

giovp avatar Jun 07 '22 18:06 giovp

Will close this due to inactivity; please reopen if needed.

giovp avatar Oct 18 '22 12:10 giovp