ValueError at sc.pp.highly_variable_genes() using seurat_v3 flavor.

Open lijinbio opened this issue 3 years ago • 1 comments

  • [x] I have checked that this issue has not already been reported.
  • [x] I have confirmed this bug exists on the latest version of scanpy.
  • [ ] (optional) I have confirmed this bug exists on the master branch of scanpy.

The seurat_v3 flavor for HVGs fails on some inputs with a ValueError raised from the LOESS fit. See the reproduction below.

import scanpy as sc
x=sc.read('merges2.h5ad', backup_url='https://ndownloader.figshare.com/files/27854286?private_link=8cd07dabcde5a773defd')
x.var_names_make_unique()
print(x)
sc.pp.highly_variable_genes(x, flavor='seurat_v3', n_top_genes=50, batch_key='sampleid', subset=True)
AnnData object with n_obs × n_vars = 600 × 32838
    obs: 'orig.ident', 'nCount_RNA', 'nFeature_RNA', 'sampleid'
    var: 'features'
Traceback (most recent call last):
  File "./main.py", line 8, in <module>
    sc.pp.highly_variable_genes(x, flavor='seurat_v3', n_top_genes=50, batch_key='sampleid', subset=True)
  File "/usr/local/lib/python3.8/site-packages/scanpy/preprocessing/_highly_variable_genes.py", line 419, in highly_variable_genes
    return _highly_variable_genes_seurat_v3(
  File "/usr/local/lib/python3.8/site-packages/scanpy/preprocessing/_highly_variable_genes.py", line 85, in _highly_variable_genes_seurat_v3
    model.fit()
  File "_loess.pyx", line 899, in _loess.loess.fit
ValueError: b'reciprocal condition number  3.9554e-16\n'

Versions


anndata 0.7.5 scanpy 1.7.2 sinfo 0.3.1

PIL 8.1.0 anndata 0.7.5 cffi 1.14.4 colorama 0.4.4 cycler 0.10.0 cython_runtime NA dateutil 2.8.1 get_version 2.1 google NA h5py 2.10.0 igraph 0.8.3 joblib 1.0.0 kiwisolver 1.3.1 legacy_api_wrap 1.2 leidenalg 0.8.3 llvmlite 0.35.0 louvain 0.6.1 matplotlib 3.3.3 mpl_toolkits NA natsort 7.1.0 numba 0.52.0 numexpr 2.7.2 numpy 1.18.1 packaging 20.8 pandas 1.0.1 pkg_resources NA psutil 5.8.0 pyparsing 2.4.7 pytz 2020.1 scanpy 1.7.2 scipy 1.4.1 setuptools_scm NA sinfo 0.3.1 sitecustomize NA six 1.15.0 sklearn 0.22.2.post1 tables 3.6.1 texttable 1.6.3 typing_extensions NA wcwidth 0.2.5 yaml 5.3.1

Python 3.8.8 (default, Feb 26 2021, 23:59:43) [Clang 12.0.0 (clang-1200.0.32.29)] macOS-10.15.7-x86_64-i386-64bit 4 logical CPU cores, i386

Session information updated at 2021-05-03 11:41

lijinbio, May 03 '21 16:05

I encountered the same problem. If I call sc.pp.log1p(x) before calculating HVGs with the seurat_v3 flavor, the error goes away. In my view, it is related to whether adata.X is sparse or dense.

yuxiaokang-source, May 28 '22 02:05

Hi! This is answered here: https://github.com/scverse/scanpy/issues/2669

TL;DR: set a higher span when calculating HVGs. The default is 0.3; 0.5 worked for me. For example:

sc.pp.highly_variable_genes(x, flavor='seurat_v3', n_top_genes=50, batch_key='sampleid', subset=True, span=0.5)

VladimirShitov, Oct 18 '23 12:10
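If no single span value is known to work in advance, the workaround above can be wrapped in a small retry loop. The following is only a sketch: the helper name `call_with_increasing_span` and the list of candidate spans are assumptions made for illustration and are not part of scanpy. The helper takes any callable that accepts a span and retries it with the next larger value whenever a ValueError is raised.

```python
from typing import Callable, Iterable, Tuple, TypeVar

T = TypeVar("T")


def call_with_increasing_span(
    fn: Callable[[float], T],
    spans: Iterable[float] = (0.3, 0.5, 0.7, 1.0),  # assumed candidates, smallest first
) -> Tuple[float, T]:
    """Call fn(span) for each candidate span until one does not raise ValueError.

    Returns the span that worked together with fn's return value.
    Hypothetical helper; not part of scanpy.
    """
    last_error = None
    for span in spans:
        try:
            return span, fn(span)
        except ValueError as err:
            # The LOESS failure in this issue surfaces as a ValueError,
            # so remember it and try the next, larger span.
            last_error = err
    raise ValueError(f"fit failed for every span tried: {last_error}")


# Hypothetical usage with the call from this issue
# (requires scanpy and the AnnData object x loaded as in the report):
#
# span, _ = call_with_increasing_span(
#     lambda s: sc.pp.highly_variable_genes(
#         x, flavor='seurat_v3', n_top_genes=50,
#         batch_key='sampleid', subset=True, span=s,
#     )
# )
# print(f"HVG selection succeeded with span={span}")
```

Note that the loop catches every ValueError, not only the LOESS one, so an unrelated error (for example, a wrong batch_key) would only be reported after all spans have been tried. A larger span also smooths the mean-variance fit more strongly, so it is worth checking that the selected genes still look reasonable.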

@ivirshup, this issue can be closed as both solved and a duplicate. Also, please check my comment in https://github.com/scverse/scanpy/issues/2669

VladimirShitov, Oct 18 '23 12:10