"ValueError: value.index does not match parent’s axis 0 names" error when trying to read h5ad that was processed through scVI

Open nagendraKU opened this issue 3 years ago • 1 comments

Hi!

I am new to the Scanpy/AnnData ecosystem and am trying to use scVI for data integration.

I trained an scVI model on an AnnData object and saved the object as h5ad. Now I get an error when trying to read the file.

The error occurs with both scanpy.read_h5ad and scvi.data.read_h5ad. I found someone else reporting the same error in this repo before, so I am posting the issue here. Kindly let me know if I should post this somewhere else.

adipo_all = scvi.data.read_h5ad("/home/yyyyy/analysis/anndata_working/adipo_sn_01112021_trained_v1.h5ad")

/home/xxxxx/pyenv_custom/py_scanalysis_env/lib/python3.7/site-packages/anndata/_core/anndata.py:120: ImplicitModificationWarning: Transforming to str index.
  warnings.warn("Transforming to str index.", ImplicitModificationWarning)

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
/scratch/33545574/ipykernel_5065/2292619582.py in <module>
----> 1 adipo_all = scvi.data.read_h5ad("/home/yyyyy/analysis/anndata_working/adipo_sn_01112021_trained_v1.h5ad")

/home/xxxxx/pyenv_custom/py_scanalysis_env/lib/python3.7/site-packages/anndata/_io/h5ad.py in read_h5ad(filename, backed, as_sparse, as_sparse_fmt, chunk_size)
    435     _clean_uns(d)  # backwards compat
    436 
--> 437     return AnnData(**d)
    438 
    439 

/home/xxxxx/pyenv_custom/py_scanalysis_env/lib/python3.7/site-packages/anndata/_core/anndata.py in __init__(self, X, obs, var, uns, obsm, varm, layers, raw, dtype, shape, filename, filemode, asview, obsp, varp, oidx, vidx)
    320                 varp=varp,
    321                 filename=filename,
--> 322                 filemode=filemode,
    323             )
    324 

/home/xxxxx/pyenv_custom/py_scanalysis_env/lib/python3.7/site-packages/anndata/_core/anndata.py in _init_as_actual(self, X, obs, var, uns, obsm, varm, varp, obsp, raw, layers, dtype, shape, filename, filemode)
    508 
    509         # TODO: Think about consequences of making obsm a group in hdf
--> 510         self._obsm = AxisArrays(self, 0, vals=convert_to_dict(obsm))
    511         self._varm = AxisArrays(self, 1, vals=convert_to_dict(varm))
    512 

/home/xxxxx/pyenv_custom/py_scanalysis_env/lib/python3.7/site-packages/anndata/_core/aligned_mapping.py in __init__(self, parent, axis, vals)
    233         self._data = dict()
    234         if vals is not None:
--> 235             self.update(vals)
    236 
    237 

/home/xxxxx/pyenv_custom/py_scanalysis_env/lib/python3.7/_collections_abc.py in update(*args, **kwds)
    839             if isinstance(other, Mapping):
    840                 for key in other:
--> 841                     self[key] = other[key]
    842             elif hasattr(other, "keys"):
    843                 for key in other.keys():

/home/xxxxx/pyenv_custom/py_scanalysis_env/lib/python3.7/site-packages/anndata/_core/aligned_mapping.py in __setitem__(self, key, value)
    149 
    150     def __setitem__(self, key: str, value: V):
--> 151         value = self._validate_value(value, key)
    152         self._data[key] = value
    153 

/home/xxxxx/pyenv_custom/py_scanalysis_env/lib/python3.7/site-packages/anndata/_core/aligned_mapping.py in _validate_value(self, val, key)
    211             # Could probably also re-order index if it’s contained
    212             raise ValueError(
--> 213                 f"value.index does not match parent’s axis {self.axes[0]} names"
    214             )
    215         return super()._validate_value(val, key)

ValueError: value.index does not match parent’s axis 0 names

This is the offending anndata object.

AnnData object with n_obs × n_vars = 123472 × 36795
    obs: 'orig.ident', 'nCount_RNA', 'nFeature_RNA', 'nCount_spliced', 'nFeature_spliced', 'nCount_unspliced', 'nFeature_unspliced', 'nCount_HTO', 'nFeature_HTO', 'HTO_maxID', 'HTO_secondID', 'HTO_margin', 'HTO_classification', 'HTO_classification.global', 'hash.ID', 'nCount_SCT', 'nFeature_SCT', 'batch', 'n_genes', 'annot', 'sample_origin', 'day', 'depot', 'tissue', 'timepoint', 'n_genes_by_counts', 'total_counts', 'total_counts_percent_ribo', 'pct_counts_percent_ribo', '_scvi_batch', '_scvi_labels'
    var: 'features', 'spliced_features', 'unspliced_features', 'n_cells', 'highly_variable', 'highly_variable_rank', 'means', 'variances', 'variances_norm', 'highly_variable_nbatches', 'percent_ribo', 'n_cells_by_counts', 'mean_counts', 'pct_dropout_by_counts', 'total_counts'
    uns: 'hvg', 'sample_origin_colors', 'depot_colors', '_scvi'
    obsm: '_scvi_extra_categoricals', '_scvi_extra_continuous', 'X_scVI'
    layers: 'counts', 'spliced', 'unspliced'
    

By contrast, the earlier version of the object, shown below, can be read just fine. To get from it to the object above, I ran scanpy.pp.highly_variable_genes and sc.pp.calculate_qc_metrics, then trained the model in scVI.

AnnData object with n_obs × n_vars = 123472 × 36795
    obs: 'orig.ident', 'nCount_RNA', 'nFeature_RNA', 'nCount_spliced', 'nFeature_spliced', 'nCount_unspliced', 'nFeature_unspliced', 'nCount_HTO', 'nFeature_HTO', 'HTO_maxID', 'HTO_secondID', 'HTO_margin', 'HTO_classification', 'HTO_classification.global', 'hash.ID', 'nCount_SCT', 'nFeature_SCT', 'batch', 'n_genes', 'annot', 'sample_origin', 'day', 'depot', 'tissue', 'timepoint'
    var: 'features', 'spliced_features', 'unspliced_features', 'n_cells'
    layers: 'counts', 'spliced', 'unspliced'

Libraries: scanpy==1.8.1 anndata==0.7.6 umap==0.5.1 numpy==1.20.3 scipy==1.7.1 pandas==1.3.3 scikit-learn==0.24.2 statsmodels==0.13.0rc0 python-igraph==0.9.6 pynndescent==0.5.4

nagendraKU avatar Nov 01 '21 15:11 nagendraKU

I had a similar error, and this resolved it: make sure every DataFrame stored in 'obsm' has the same index as the 'obs' DataFrame.
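
A minimal sketch of that check and fix, using pandas only. The helper names `find_mismatched_frames` and `align_obsm_frames` are hypothetical (they are not part of anndata or scvi-tools), and a plain dict stands in for `adata.obsm`. It also assumes the rows of each DataFrame are already in the same order as `obs`, so the index can simply be overwritten by position; if that is not true for your data, this would mislabel rows.

```python
import pandas as pd


def find_mismatched_frames(obs_names, obsm):
    """Return the keys of DataFrames in `obsm` whose index differs from `obs_names`."""
    obs_names = pd.Index(obs_names)
    return [
        key
        for key, value in obsm.items()
        if isinstance(value, pd.DataFrame) and not value.index.equals(obs_names)
    ]


def align_obsm_frames(obs_names, obsm):
    """Return a new dict in which every DataFrame carries `obs_names` as its index.

    ASSUMPTION: rows are already in the same order as `obs`, so the index is
    replaced by position. Non-DataFrame entries (e.g. arrays) are passed through.
    """
    obs_names = pd.Index(obs_names)
    fixed = {}
    for key, value in obsm.items():
        if isinstance(value, pd.DataFrame):
            if len(value) != len(obs_names):
                raise ValueError(
                    f"obsm[{key!r}] has {len(value)} rows, expected {len(obs_names)}"
                )
            value = value.copy()  # leave the caller's DataFrame untouched
            value.index = obs_names
        fixed[key] = value
    return fixed


# Toy stand-ins: string obs names, but an integer index on the obsm DataFrame.
obs_names = ["0", "1", "2"]
obsm = {
    "_scvi_extra_continuous": pd.DataFrame({"x": [1.0, 2.0, 3.0]}, index=[0, 1, 2]),
    "X_scVI": [[0.1], [0.2], [0.3]],  # not a DataFrame, so it is ignored
}
bad_keys = find_mismatched_frames(obs_names, obsm)
fixed = align_obsm_frames(obs_names, obsm)
```

With a real AnnData object, the idea would be to pass `adata.obs_names` and `adata.obsm` and apply the fix before writing the h5ad file. I have not tested that against the library versions listed above.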

cinaljess avatar May 31 '22 18:05 cinaljess

This issue has been automatically marked as stale because it has not had recent activity. Please add a comment if you want to keep the issue open. Thank you for your contributions!

github-actions[bot] avatar Jun 23 '23 02:06 github-actions[bot]

Seems like this is solved, and the discussion is continuing in #311.

Please tell us if you need anything

flying-sheep avatar Jun 23 '23 09:06 flying-sheep