
save data on disk error

Open wangjiawen2013 opened this issue 2 years ago • 4 comments

Hi, when I saved multimodal data to disk following https://muon-tutorials.readthedocs.io/en/latest/single-cell-rna-atac/pbmc10k/1-Gene-Expression-Processing.html, an error occurred:

TypeError: No method has been defined for writing <class 'collections.OrderedDict'> elements to <class 'h5py._hl.group.Group'>
Above error raised while writing key 'atac' of <class 'h5py._hl.group.Group'> to /

But I can see that the object had already been written to disk. When I then loaded it with mdata = mu.read("data/pbmc10k.h5mu"), it warned:

/home/wangjw/programs/miniconda3/envs/scenic/lib/python3.7/site-packages/mudata/_core/io.py:366: UserWarning: The HDF5 file was not created by muon, we can't guarantee that everything will work correctly
  "The HDF5 file was not created by muon, we can't guarantee that everything will work correctly"

What's wrong here?

wangjiawen2013 avatar Sep 26 '22 09:09 wangjiawen2013

Hi, I encountered the same problem. When I loaded the h5mu file, the .uns under the atac assay was missing.

PrachTecha avatar Sep 27 '22 18:09 PrachTecha

I am also getting this same error. It would be nice if the tutorial included the information about the version numbers of the dependencies used...

sc.logging.print_header()

megadesk avatar Oct 03 '22 19:10 megadesk

Hey everyone, I believe this is the same issue as https://github.com/scverse/muon/issues/65.

I've removed the last parts where OrderedDict was still used in muon in 03a3ebc, so the current master should not create any more OrderedDicts. Serialisation is based on AnnData, however, so for that please track the issue https://github.com/scverse/anndata/issues/796. Until it is fixed in AnnData >0.8, OrderedDicts can't be serialised. Thankfully, the workaround is simply to convert them to regular dicts:

mdata.uns["atac"]["my_dict"] = dict(mdata.uns["atac"]["my_dict"])
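If OrderedDicts are nested at several levels, converting one key at a time gets tedious. Below is a minimal sketch of a recursive conversion in plain Python. The helper name `to_plain_dict` and the sample `uns` contents are hypothetical, chosen only for illustration; it has not been tested against muon or AnnData objects.

```python
from collections import OrderedDict


def to_plain_dict(obj):
    """Return a copy of obj with every dict subclass (including
    OrderedDict) replaced by a built-in dict, at any nesting depth.
    Non-dict values are returned unchanged."""
    if isinstance(obj, dict):  # OrderedDict is a subclass of dict
        return {key: to_plain_dict(value) for key, value in obj.items()}
    return obj


# Hypothetical stand-in for an .uns mapping; the keys are illustrative only.
uns = {
    "atac": OrderedDict(inner=OrderedDict(a=1, b=2)),
    "files": OrderedDict(fragments="some/path"),
}

clean = to_plain_dict(uns)
print(type(clean["atac"]).__name__, type(clean["atac"]["inner"]).__name__)
```

Assuming the modality's `.uns` behaves like an ordinary mutable mapping, each value could then be replaced in place, e.g. `for key in adata.uns: adata.uns[key] = to_plain_dict(adata.uns[key])`, before writing the file.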

gtca avatar Oct 03 '22 20:10 gtca

Thank you! For the tutorial, this fixed the problem for me:

mdata.mod['atac'].uns["files"]=dict(mdata.mod['atac'].uns["files"])
mdata.mod['atac'].uns["atac"]=dict(mdata.mod['atac'].uns["atac"])
mdata.write("data/pbmc10k.h5mu")

Also, the Leiden clustering was slightly different (16 clusters instead of 17, and in a different order), so I had to change the cell type assignment dictionary:

# edited new_cluster_names to match tutorial clusters
new_cluster_names = {
    "0": "CD4+ memory T", "1": "CD14 mono", "2": "CD4+ naïve T", "3": "CD8+ naïve T",
    "4": "intermediate mono", "5": "CD8+ activated T", "6": "memory B", "7": "NK",
    "8": "CD16 mono", "10": "naïve B", "11": "mDC",
    "13": "pDC", "14": "MAIT",
}
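Since the cluster numbering can change between runs, it may help to check which cluster labels have no entry in the mapping before renaming. This is a small sketch in plain Python. It assumes the 16 clusters mentioned above are labelled with the strings "0" to "15", and it copies only the keys of the dictionary above; the variable names are hypothetical.

```python
# Keys of new_cluster_names as posted above (values omitted for brevity).
named_clusters = {"0", "1", "2", "3", "4", "5", "6", "7", "8",
                  "10", "11", "13", "14"}

# Assumption: 16 Leiden clusters labelled "0".."15".
cluster_labels = [str(i) for i in range(16)]

# Labels that would be left without a cell type name, in numeric order.
unnamed = sorted(set(cluster_labels) - named_clusters, key=int)
print(unnamed)  # ['9', '12', '15']
```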

scanpy==1.8.2 anndata==0.8.0 umap==0.5.3 numpy==1.23.3 scipy==1.9.1 pandas==1.5.0 scikit-learn==1.0.2 statsmodels==0.13.2 python-igraph==0.9.11 louvain==0.7.1 pynndescent==0.5.7

megadesk avatar Oct 03 '22 20:10 megadesk