Data loading and object construction

Open ccruizm opened this issue 2 years ago • 2 comments

Good day!

I want to use MultiVI to integrate scRNA, scATAC and multiome (RNA+ATAC) data. I saw that the tutorial reads the files directly from the cellranger output directory; however, I have independent AnnData objects carrying the info for each modality. I tried to use Muon to create the paired modality, but when I call scvi.data.organize_multiome_anndatas, I get an error.

The code I am using is:

import muon as mu
import scanpy as sc
import scvi

adata_rna_multiome = sc.read('data/subset_rna_multi.h5ad')
adata_atac_multiome = sc.read('data/subset_atac_multi.h5ad')

adata_paired = mu.MuData({'rna': adata_rna_multiome, 'atac': adata_atac_multiome})

adata_rna = sc.read('data/subset_rna_only.h5ad')
adata_atac = sc.read('data/subset_atac_only.h5ad')

adata_mvi = scvi.data.organize_multiome_anndatas(adata_paired, adata_rna, adata_atac)

and the error is:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[12], line 2
      1 # We can now use the organizing method from scvi to concatenate these anndata
----> 2 adata_mvi = scvi.data.organize_multiome_anndatas(adata_paired, adata_rna, adata_atac)

File ~/miniconda3/envs/multi_integration/lib/python3.9/site-packages/scvi/data/_preprocessing.py:331, in organize_multiome_anndatas(multi_anndata, rna_anndata, atac_anndata, modality_key)
    328     return multi_anndata.concatenate(other, join="outer", batch_key=modality_key)
    330 if rna_anndata is not None:
--> 331     res_anndata = _concat_anndata(res_anndata, rna_anndata)
    333     modality_ann += ["expression"] * rna_anndata.shape[0]
    334     obs_names += list(rna_anndata.obs.index.values)

File ~/miniconda3/envs/multi_integration/lib/python3.9/site-packages/scvi/data/_preprocessing.py:328, in organize_multiome_anndatas.<locals>._concat_anndata(multi_anndata, other)
    325     raise ValueError("No shared features between Multiome and other AnnData.")
    327 other = other[:, shared_features]
--> 328 return multi_anndata.concatenate(other, join="outer", batch_key=modality_key)

AttributeError: 'MuData' object has no attribute 'concatenate'

How do you think I could construct the paired object starting from the AnnData objects I already have pre-processed? I saw you plan on giving Muon support (https://github.com/scverse/scvi-tools/issues/1935). Is that something that will happen any time soon?

Thanks in advance for your help!

ccruizm avatar May 15 '23 14:05 ccruizm

Hi, thank you for your question. organize_multiome_anndatas expects all of its inputs to be AnnData objects and returns an AnnData as the final object. One of your inputs is a MuData, which is why it's throwing that error. I would recommend constructing the full MuData yourself, as we don't have a function for that in scvi yet.

We plan on adding MuData support for MultiVI in v1.1, which won't be out for another couple of weeks, unfortunately.

martinkim0 avatar May 15 '23 23:05 martinkim0

Thanks @martinkim0 for your speedy reply!

I have run the pipeline but could not get a proper integration of the different modalities. I want to make sure the input data is correct. To confirm: does it expect raw counts for RNA and binary data for ATAC, or does the ATAC data need any other preprocessing?

ccruizm avatar May 17 '23 07:05 ccruizm

Hi, sorry for following up late. Correct - for RNA it expects raw counts and for ATAC binary. For future reference, we prefer having these types of usage questions in our Discourse forum.

martinkim0 avatar Jul 12 '24 17:07 martinkim0
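
For anyone landing here with the same question, the input requirements confirmed above (raw counts for RNA, binary for ATAC) can be checked and applied with a few lines of NumPy/SciPy. This is a minimal sketch under my own assumptions, not code from scvi-tools: is_raw_counts and binarize are hypothetical helper names, and the matrices stand in for the .X of the RNA and ATAC AnnData objects.

```python
import numpy as np
from scipy import sparse


def is_raw_counts(X):
    """Return True if every stored value is a non-negative integer."""
    data = X.data if sparse.issparse(X) else np.asarray(X)
    return bool(np.all(data >= 0) and np.all(np.mod(data, 1) == 0))


def binarize(X):
    """Return a copy of X in which every positive entry is set to 1."""
    if sparse.issparse(X):
        X = sparse.csr_matrix(X, copy=True)
        X.data = (X.data > 0).astype(np.float32)
        X.eliminate_zeros()
        return X
    return (np.asarray(X) > 0).astype(np.float32)


# Stand-ins for adata_atac.X and adata_rna.X.
atac_counts = sparse.csr_matrix(np.array([[0, 2, 5], [1, 0, 0]], dtype=np.float32))
atac_binary = binarize(atac_counts)

rna_counts = np.array([[3, 0], [1, 7]])
rna_normalized = np.log1p(rna_counts)  # no longer raw counts

print(atac_binary.toarray())
print(is_raw_counts(rna_counts), is_raw_counts(rna_normalized))
```

If is_raw_counts returns False for the RNA matrix, the object has probably been normalized or log-transformed already, and the raw counts would need to be recovered from wherever they were stored during preprocessing.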