anndata "ValueError: value.index does not match parent’s axis 0 names" error when trying to read h5ad that was processed through scVI

Hi!

I am new to the Scanpy/AnnData ecosystem and am trying to use scVI for data integration.

I trained an scVI model on an AnnData object and saved the object as h5ad. Now I get an error when trying to read the file back.

The error occurs with both scanpy.read_h5ad and scvi.data.read_h5ad. I found someone else reporting the same error in this repo before, so I am posting the issue here. Please let me know if I should post it somewhere else.

adipo_all = scvi.data.read_h5ad("/home/yyyyy/analysis/anndata_working/adipo_sn_01112021_trained_v1.h5ad")

/home/xxxxx/pyenv_custom/py_scanalysis_env/lib/python3.7/site-packages/anndata/_core/anndata.py:120: ImplicitModificationWarning: Transforming to str index.
  warnings.warn("Transforming to str index.", ImplicitModificationWarning)

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
/scratch/33545574/ipykernel_5065/2292619582.py in <module>
----> 1 adipo_all = scvi.data.read_h5ad("/home/yyyyy/analysis/anndata_working/adipo_sn_01112021_trained_v1.h5ad")

/home/xxxxx/pyenv_custom/py_scanalysis_env/lib/python3.7/site-packages/anndata/_io/h5ad.py in read_h5ad(filename, backed, as_sparse, as_sparse_fmt, chunk_size)
    435     _clean_uns(d)  # backwards compat
    436 
--> 437     return AnnData(**d)
    438 
    439 

/home/xxxxx/pyenv_custom/py_scanalysis_env/lib/python3.7/site-packages/anndata/_core/anndata.py in __init__(self, X, obs, var, uns, obsm, varm, layers, raw, dtype, shape, filename, filemode, asview, obsp, varp, oidx, vidx)
    320                 varp=varp,
    321                 filename=filename,
--> 322                 filemode=filemode,
    323             )
    324 

/home/xxxxx/pyenv_custom/py_scanalysis_env/lib/python3.7/site-packages/anndata/_core/anndata.py in _init_as_actual(self, X, obs, var, uns, obsm, varm, varp, obsp, raw, layers, dtype, shape, filename, filemode)
    508 
    509         # TODO: Think about consequences of making obsm a group in hdf
--> 510         self._obsm = AxisArrays(self, 0, vals=convert_to_dict(obsm))
    511         self._varm = AxisArrays(self, 1, vals=convert_to_dict(varm))
    512 

/home/xxxxx/pyenv_custom/py_scanalysis_env/lib/python3.7/site-packages/anndata/_core/aligned_mapping.py in __init__(self, parent, axis, vals)
    233         self._data = dict()
    234         if vals is not None:
--> 235             self.update(vals)
    236 
    237 

/home/xxxxx/pyenv_custom/py_scanalysis_env/lib/python3.7/_collections_abc.py in update(*args, **kwds)
    839             if isinstance(other, Mapping):
    840                 for key in other:
--> 841                     self[key] = other[key]
    842             elif hasattr(other, "keys"):
    843                 for key in other.keys():

/home/xxxxx/pyenv_custom/py_scanalysis_env/lib/python3.7/site-packages/anndata/_core/aligned_mapping.py in __setitem__(self, key, value)
    149 
    150     def __setitem__(self, key: str, value: V):
--> 151         value = self._validate_value(value, key)
    152         self._data[key] = value
    153 

/home/xxxxx/pyenv_custom/py_scanalysis_env/lib/python3.7/site-packages/anndata/_core/aligned_mapping.py in _validate_value(self, val, key)
    211             # Could probably also re-order index if it’s contained
    212             raise ValueError(
--> 213                 f"value.index does not match parent’s axis {self.axes[0]} names"
    214             )
    215         return super()._validate_value(val, key)

ValueError: value.index does not match parent’s axis 0 names

This is the offending AnnData object.

AnnData object with n_obs × n_vars = 123472 × 36795
    obs: 'orig.ident', 'nCount_RNA', 'nFeature_RNA', 'nCount_spliced', 'nFeature_spliced', 'nCount_unspliced', 'nFeature_unspliced', 'nCount_HTO', 'nFeature_HTO', 'HTO_maxID', 'HTO_secondID', 'HTO_margin', 'HTO_classification', 'HTO_classification.global', 'hash.ID', 'nCount_SCT', 'nFeature_SCT', 'batch', 'n_genes', 'annot', 'sample_origin', 'day', 'depot', 'tissue', 'timepoint', 'n_genes_by_counts', 'total_counts', 'total_counts_percent_ribo', 'pct_counts_percent_ribo', '_scvi_batch', '_scvi_labels'
    var: 'features', 'spliced_features', 'unspliced_features', 'n_cells', 'highly_variable', 'highly_variable_rank', 'means', 'variances', 'variances_norm', 'highly_variable_nbatches', 'percent_ribo', 'n_cells_by_counts', 'mean_counts', 'pct_dropout_by_counts', 'total_counts'
    uns: 'hvg', 'sample_origin_colors', 'depot_colors', '_scvi'
    obsm: '_scvi_extra_categoricals', '_scvi_extra_continuous', 'X_scVI'
    layers: 'counts', 'spliced', 'unspliced'

The previous version of the object, shown below, can be read just fine. To get from it to the object above, I ran scanpy.pp.highly_variable_genes and scanpy.pp.calculate_qc_metrics, followed by model training in scVI.

AnnData object with n_obs × n_vars = 123472 × 36795
    obs: 'orig.ident', 'nCount_RNA', 'nFeature_RNA', 'nCount_spliced', 'nFeature_spliced', 'nCount_unspliced', 'nFeature_unspliced', 'nCount_HTO', 'nFeature_HTO', 'HTO_maxID', 'HTO_secondID', 'HTO_margin', 'HTO_classification', 'HTO_classification.global', 'hash.ID', 'nCount_SCT', 'nFeature_SCT', 'batch', 'n_genes', 'annot', 'sample_origin', 'day', 'depot', 'tissue', 'timepoint'
    var: 'features', 'spliced_features', 'unspliced_features', 'n_cells'
    layers: 'counts', 'spliced', 'unspliced'

Libraries: scanpy==1.8.1 anndata==0.7.6 umap==0.5.1 numpy==1.20.3 scipy==1.7.1 pandas==1.3.3 scikit-learn==0.24.2 statsmodels==0.13.0rc0 python-igraph==0.9.6 pynndescent==0.5.4

Nov 01 '21 15:11 nagendraKU

I had a similar error, and the following resolved it: make sure all 'obsm' entries that are DataFrames have the same index as the 'obs' DataFrame.
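
As a rough illustration of that check, here is a minimal sketch using plain pandas objects. The helper name align_obsm_frames is hypothetical (it is not part of anndata or scvi-tools), and it assumes the rows of each 'obsm' DataFrame are already in the same order as 'obs', so that only the index labels differ (for example an integer index versus a string index).

```python
import pandas as pd


def align_obsm_frames(obs, obsm):
    """Hypothetical helper: return a dict in which every DataFrame from
    `obsm` carries the same index as `obs`.

    Assumes rows are already in the same order as `obs`; only the index
    labels are replaced, no rows are reordered.
    """
    fixed = {}
    for key, value in obsm.items():
        if isinstance(value, pd.DataFrame) and not value.index.equals(obs.index):
            if len(value) != len(obs):
                raise ValueError(
                    f"obsm[{key!r}] has {len(value)} rows, obs has {len(obs)}"
                )
            value = value.copy()
            value.index = obs.index  # positional relabelling
        fixed[key] = value
    return fixed


# Toy data: obs has a string index, the obsm frame a default integer index.
obs = pd.DataFrame({"batch": ["a", "b", "c"]}, index=["0", "1", "2"])
obsm = {"_scvi_extra_continuous": pd.DataFrame({"pct": [0.1, 0.2, 0.3]})}

fixed = align_obsm_frames(obs, obsm)
```

On a real object the same idea would apply to the DataFrames in adata.obsm, compared against adata.obs.index, before writing the h5ad file. If the row order might differ, relabelling by position like this would silently mismatch cells, so that case needs a different approach.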

May 31 '22 18:05 cinaljess

This issue has been automatically marked as stale because it has not had recent activity. Please add a comment if you want to keep the issue open. Thank you for your contributions!

Jun 23 '23 02:06 github-actions[bot]

It seems this is solved, and the discussion continues in #311.

Please tell us if you need anything.

Jun 23 '23 09:06 flying-sheep
