Illegal slicing argument for scalar dataspace when attempting to read 10x_h5 with version 1.9.0

Open nadavyayon opened this issue 2 years ago • 4 comments

Hi

When attempting to simply read an h5 file (Python version 3.8.8) with:

# results_file = path to 10X h5 file
# adata = sc.read_10x_h5(results_file)

I get the following error, which goes away when rolling back to scanpy=1.8.2:

ValueError                                Traceback (most recent call last)
<ipython-input-3-8ddd0a13aab2> in <module>
      8     print(results_file)
----> 9     adata = sc.read_10x_h5(results_file)
     10     adata.var_names_make_unique()
     11     adata.obs.index = meta.iloc[idx,2] + '-' + adata.obs.index

/opt/conda/lib/python3.8/site-packages/scanpy/readwrite.py in read_10x_h5(filename, genome, gex_only, backup_url)
    181         v3 = '/matrix' in f
    182     if v3:
--> 183         adata = _read_v3_10x_h5(filename, start=start)
    184         if genome:
    185             if genome not in adata.var['genome'].values:

/opt/conda/lib/python3.8/site-packages/scanpy/readwrite.py in _read_v3_10x_h5(filename, start)
    266         try:
    267             dsets = {}
--> 268             _collect_datasets(dsets, f["matrix"])
    269 
    270             from scipy.sparse import csr_matrix

/opt/conda/lib/python3.8/site-packages/scanpy/readwrite.py in _collect_datasets(dsets, group)
    254     for k, v in group.items():
    255         if isinstance(v, h5py.Dataset):
--> 256             dsets[k] = v[:]
    257         else:
    258             _collect_datasets(dsets, v)

h5py/_objects.pyx in h5py._objects.with_phil.wrapper()

h5py/_objects.pyx in h5py._objects.with_phil.wrapper()

/opt/conda/lib/python3.8/site-packages/h5py/_hl/dataset.py in __getitem__(self, args, new_dtype)
    767         if self.shape == ():
    768             fspace = self.id.get_space()
--> 769             selection = sel2.select_read(fspace, args)
    770             if selection.mshape is None:
    771                 arr = numpy.ndarray((), dtype=new_dtype)

/opt/conda/lib/python3.8/site-packages/h5py/_hl/selections2.py in select_read(fspace, args)
     99     """
    100     if fspace.shape == ():
--> 101         return ScalarReadSelection(fspace, args)
    102 
    103     raise NotImplementedError()

/opt/conda/lib/python3.8/site-packages/h5py/_hl/selections2.py in __init__(self, fspace, args)
     84             self.mshape = ()
     85         else:
---> 86             raise ValueError("Illegal slicing argument for scalar dataspace")
     87 
     88         self.mspace = h5s.create(h5s.SCALAR)

ValueError: Illegal slicing argument for scalar dataspace

Thanks!!

Nadav

nadavyayon avatar Apr 02 '22 17:04 nadavyayon

Can you share the output of sc.logging.print_versions() in the environment that's causing you problems?

I'm unable to reproduce with recent cellranger outputs.

ivirshup avatar Apr 04 '22 13:04 ivirshup

Hey sorry for the delay:

-----
anndata     0.7.5
scanpy      1.9.0
-----
PIL                 8.1.2
anyio               NA
attr                20.3.0
babel               2.9.0
backcall            0.2.0
brotli              NA
cairo               1.20.0
certifi             2020.12.05
cffi                1.14.5
chardet             4.0.0
cloudpickle         1.6.0
colorama            0.4.4
cycler              0.10.0
cython_runtime      NA
cytoolz             0.11.0
dask                2021.03.1
dateutil            2.8.1
decorator           4.4.2
fsspec              0.8.7
google              NA
h5py                3.1.0
idna                2.10
igraph              0.8.3
ipykernel           5.5.0
ipython_genutils    0.2.0
jedi                0.18.0
jinja2              2.11.3
joblib              1.0.1
json5               NA
jsonschema          3.2.0
jupyter_server      1.4.1
jupyterlab_server   2.3.0
kiwisolver          1.3.1
leidenalg           0.8.3
llvmlite            0.34.0
louvain             0.7.0
markupsafe          1.1.1
matplotlib          3.3.4
mpl_toolkits        NA
natsort             7.1.1
nbclassic           NA
nbformat            5.1.2
numba               0.51.2
numpy               1.20.1
packaging           20.9
pandas              1.2.3
parso               0.8.1
pexpect             4.8.0
pickleshare         0.7.5
pkg_resources       NA
prometheus_client   NA
prompt_toolkit      3.0.16
psutil              5.8.0
ptyprocess          0.7.0
pvectorc            NA
pyarrow             0.16.0
pygments            2.8.0
pyparsing           2.4.7
pyrsistent          NA
pytoml              NA
pytz                2021.1
requests            2.25.1
ruamel              NA
scipy               1.6.1
send2trash          NA
session_info        1.0.0
setuptools_scm      NA
six                 1.15.0
sklearn             0.24.1
sniffio             1.2.0
socks               1.7.1
sphinxcontrib       NA
storemagic          NA
tblib               1.7.0
texttable           1.6.3
tlz                 0.11.0
toolz               0.11.1
tornado             6.1
traitlets           5.0.5
typing_extensions   NA
urllib3             1.26.3
wcwidth             0.2.5
yaml                5.3.1
zmq                 22.0.3
-----
IPython             7.21.0
jupyter_client      6.1.11
jupyter_core        4.7.1
jupyterlab          3.0.9
notebook            6.2.0
-----
Python 3.8.8 | packaged by conda-forge | (default, Feb 20 2021, 16:22:27) [GCC 9.3.0]
Linux-4.15.0-112-generic-x86_64-with-glibc2.10
-----
Session information updated at 2022-04-08 14:58

nadavyayon avatar Apr 08 '22 14:04 nadavyayon

I got a similar error when trying to read an .h5 file from CellBender output. I have multiome data.

`>>> adata = scanpy.read_10x_h5("/sc/arion/projects/hmDNAmap/snHeroin/analysis/ARC_TD005235-354/outs/cellbender/cb_feature_bc_matrix_filtered.h5", gex_only=False)`
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/sc/arion/work/gujarh01/software/anaconda3/lib/python3.9/site-packages/scanpy/readwrite.py", line 183, in read_10x_h5
    adata = _read_v3_10x_h5(filename, start=start)
  File "/sc/arion/work/gujarh01/software/anaconda3/lib/python3.9/site-packages/scanpy/readwrite.py", line 268, in _read_v3_10x_h5
    _collect_datasets(dsets, f["matrix"])
  File "/sc/arion/work/gujarh01/software/anaconda3/lib/python3.9/site-packages/scanpy/readwrite.py", line 256, in _collect_datasets
    dsets[k] = v[:]
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "/sc/arion/work/gujarh01/software/anaconda3/lib/python3.9/site-packages/h5py/_hl/dataset.py", line 738, in __getitem__
    selection = sel2.select_read(fspace, args)
  File "/sc/arion/work/gujarh01/software/anaconda3/lib/python3.9/site-packages/h5py/_hl/selections2.py", line 101, in select_read
    return ScalarReadSelection(fspace, args)
  File "/sc/arion/work/gujarh01/software/anaconda3/lib/python3.9/site-packages/h5py/_hl/selections2.py", line 86, in __init__
    raise ValueError("Illegal slicing argument for scalar dataspace")

> **ValueError: Illegal slicing argument for scalar dataspace**

>>> scanpy.logging.print_versions()

-----
anndata     0.8.0
scanpy      1.9.1
-----
PIL                 8.4.0
beta_ufunc          NA
binom_ufunc         NA
bottleneck          1.3.2
cffi                1.14.6
cloudpickle         2.0.0
colorama            0.4.4
concurrent          NA
cycler              0.10.0
cython_runtime      NA
cytoolz             0.11.0
dask                2021.10.0
dateutil            2.8.2
defusedxml          0.7.1
encodings           NA
fsspec              2021.08.1
genericpath         NA
h5py                3.3.0
igraph              0.9.6
jinja2              2.11.3
joblib              1.1.0
kiwisolver          1.3.1
leidenalg           0.8.7
llvmlite            0.37.0
markupsafe          1.1.1
matplotlib          3.4.3
mkl                 2.4.0
mpl_toolkits        NA
natsort             7.1.1
nbinom_ufunc        NA
ntpath              NA
numba               0.54.1
numexpr             2.7.3
numpy               1.20.3
opcode              NA
packaging           21.0
pandas              1.3.4
pkg_resources       NA
posixpath           NA
psutil              5.8.0
pyexpat             NA
pyparsing           3.0.4
pytz                2021.3
scipy               1.7.1
scrublet            NA
session_info        1.0.0
six                 1.16.0
sklearn             0.24.2
sphinxcontrib       NA
sre_compile         NA
sre_constants       NA
sre_parse           NA
tblib               1.7.0
texttable           1.6.4
tlz                 0.11.0
toolz               0.11.1
typing_extensions   NA
wcwidth             0.2.5
yaml                6.0
zope                NA
-----
Python 3.9.7 (default, Sep 16 2021, 13:09:58) [GCC 7.5.0]
Linux-3.10.0-957.10.1.el7.x86_64-x86_64-with-glibc2.17
-----
Session information updated at 2022-05-17 14:56

beetlejuice007 avatar May 17 '22 18:05 beetlejuice007

I'm getting the same error using the CellBender tutorial output. Attaching the file to make it easier to reproduce.

tiny_10x_pbmc_filtered.h5.zip

sc.logging.print_versions()

-----
anndata     0.7.8
scanpy      1.9.1
-----
PIL                 9.0.1
asttokens           NA
backcall            0.2.0
beta_ufunc          NA
binom_ufunc         NA
cffi                1.15.0
cycler              0.10.0
cython_runtime      NA
dateutil            2.8.2
debugpy             1.6.0
decorator           5.1.1
defusedxml          0.7.1
doubletdetection    4.2
entrypoints         0.4
executing           0.8.3
google              NA
h5py                3.6.0
hypergeom_ufunc     NA
igraph              0.9.9
ipykernel           6.10.0
ipython_genutils    0.2.0
ipywidgets          7.7.0
jedi                0.18.1
joblib              1.1.0
kiwisolver          1.4.2
leidenalg           0.8.9
llvmlite            0.38.0
louvain             0.7.1
matplotlib          3.5.1
matplotlib_inline   NA
mkl                 2.4.0
mpl_toolkits        NA
mudata              0.1.1
muon                0.1.2
natsort             8.1.0
nbinom_ufunc        NA
numba               0.55.1
numexpr             2.8.1
numpy               1.21.2
organize_metadata   NA
packaging           21.3
pandas              1.4.1
parso               0.8.3
pexpect             4.8.0
phenograph          1.5.7
pickleshare         0.7.5
pkg_resources       NA
prompt_toolkit      3.0.28
psutil              5.9.0
ptyprocess          0.7.0
pure_eval           0.2.2
pycparser           2.21
pydev_ipython       NA
pydevconsole        NA
pydevd              2.8.0
pydevd_file_utils   NA
pydevd_plugins      NA
pydevd_tracing      NA
pygments            2.11.2
pynndescent         0.5.6
pyparsing           3.0.7
pytz                2022.1
scikits             NA
scipy               1.8.0
seaborn             0.11.2
session_info        1.0.0
setuptools          62.0.0
setuptools_scm      NA
six                 1.16.0
sklearn             1.0.2
stack_data          0.2.0
statsmodels         0.13.2
tables              3.7.0
texttable           1.6.4
threadpoolctl       3.1.0
tornado             6.1
tqdm                4.63.1
traitlets           5.1.1
typing_extensions   NA
umap                0.5.2
wcwidth             0.2.5
yaml                6.0
zipp                NA
zmq                 22.3.0
-----
IPython             8.2.0
jupyter_client      7.1.2
jupyter_core        4.9.2
notebook            6.4.10
-----
Python 3.9.11 (main, Mar 28 2022, 10:10:35) [GCC 7.5.0]
Linux-4.15.0-142-generic-x86_64-with-glibc2.27
-----
Session information updated at 2022-05-24 15:05

erankotler avatar May 24 '22 22:05 erankotler

Was this fixed by https://github.com/scverse/scanpy/pull/2344 ? Edit: Yes

chris-rands avatar Oct 10 '22 10:10 chris-rands
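
For anyone stuck on an affected release: the tracebacks above show `_collect_datasets` reading every dataset with `v[:]`, which h5py rejects for a dataset whose shape is `()`. Indexing with an empty tuple, `v[()]`, reads a whole h5py dataset whether it is scalar or not. Below is a minimal sketch of a reader built on that idea. It is not the code from the linked PR, and I have not checked what that PR changes. The helper names `read_whole`, `collect_datasets` and `load_matrix_group` are made up for illustration, and groups are detected by duck typing (anything with `.items()`) rather than by `isinstance` checks, which is an assumption of this sketch.

```python
def read_whole(dataset):
    """Read an entire h5py-style dataset, scalar or array.

    An empty-tuple index selects everything regardless of rank,
    whereas dataset[:] raises ValueError on a scalar dataspace.
    """
    return dataset[()]


def collect_datasets(dsets, group):
    """Recursively copy every dataset under `group` into the dict `dsets`."""
    for key, value in group.items():
        if hasattr(value, "items"):
            # Group-like object (assumption: only groups expose .items()).
            collect_datasets(dsets, value)
        else:
            dsets[key] = read_whole(value)
    return dsets


def load_matrix_group(filename):
    """Hypothetical helper: dump everything under /matrix of a 10x-style h5 file."""
    import h5py  # imported lazily so the helpers above have no hard dependency

    with h5py.File(filename, "r") as f:
        return collect_datasets({}, f["matrix"])
```

Calling `load_matrix_group(results_file)` returns a plain dict keyed by dataset name, which makes it easy to see which entry under `matrix` is the scalar one that trips up `v[:]`.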