
Adding modality to MuData.mod

Open racng opened this issue 2 years ago • 1 comments

Is your feature request related to a problem? Please describe.

After loading a CITE-seq 10x h5 file with muon and a 10x VDJ file with scirpy, I tried adding the AIRR modality to the existing mdata by assigning it to mdata.mod. It seemed to work, since mdata shows that it has 3 modalities. However, when I tried to write the mdata, I got an error:

SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

However, there is no error if I initialize a new mdata object with all three modalities and save that.

Describe the solution you'd like

It would be nice if we could add a modality with mdata.mod['airr'] = adata.copy() without causing warnings.

Describe alternatives you've considered

We currently need to create a new object whenever we want to add a new modality:

new_mdata = mu.MuData({
	'rna': mdata.mod['rna'].copy(),
	'new': adata.copy(),
	'prot': mdata.mod['prot'].copy()
})

Additional context

mdata = mu.read_10x_h5(gex_path)
adata = ir.io.read_10x_vdj(vdj_path)
mdata.mod['airr'] = adata.copy()
mdata.write('test.h5mu')

Full error message when saving the mdata after adding adata to mdata.mod

/users/rng/mambaforge/envs/compbio/lib/python3.10/site-packages/anndata/_core/anndata.py:1230: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df[key] = c
/users/rng/mambaforge/envs/compbio/lib/python3.10/site-packages/anndata/_core/anndata.py:1230: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df[key] = c
/users/rng/mambaforge/envs/compbio/lib/python3.10/site-packages/anndata/_core/anndata.py:1230: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df[key] = c

racng • Aug 01 '23 01:08

Thank you, @racng!

As this seems to be a warning, it shouldn't generally stand in the way of adding modalities.

So far I see that this warning might be related to the case where feature names are duplicated across modalities. In that case .varmap also looks fragmented (e.g. array([1, 0, 2, 0, 3, 0, ...]) and array([0, 1, 0, 2, 0, 3, ...]) for two modalities with the same var_names). This does not happen when a MuData object is created directly from modalities with these duplicated feature names.

With no name duplicates, there should be no problem like this and no warning!
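
To check whether this applies to a given dataset, one option is to look for feature names that occur in more than one modality before adding the new one. The following is only a minimal sketch in plain Python: the helper `shared_feature_names` and the example feature names are hypothetical and not part of mudata.

```python
from collections import defaultdict


def shared_feature_names(var_names_by_mod):
    """Map each feature name found in more than one modality to those modalities.

    `var_names_by_mod` is a dict of modality name -> iterable of feature names.
    (Hypothetical helper for illustration; not a mudata function.)
    """
    seen = defaultdict(list)
    for mod, names in var_names_by_mod.items():
        # Use a set so repeats within a single modality are counted once.
        for name in set(names):
            seen[name].append(mod)
    return {name: sorted(mods) for name, mods in seen.items() if len(mods) > 1}


# Made-up feature names, for illustration only.
example = {
    "rna": ["CD3E", "CD4", "CD8A"],
    "prot": ["CD4", "CD19"],
    "airr": [],
}
duplicates = shared_feature_names(example)
print(duplicates)
```

With a real object, the input could be built as `{name: list(adata.var_names) for name, adata in mdata.mod.items()}`. An empty result would suggest that duplicated names are not the cause of the warning.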


Version 0.3 of mudata will come with a fix to this warning — together with improved name duplicates handling so that varmap looks better and the behaviour is more intuitive when adding modalities. 🎉

gtca • Sep 11 '23 11:09

This should be warning-free in v0.3 but please feel free to open a new issue if there's something else we can improve.

gtca • Jul 03 '24 08:07