error in calculating biadjacency_matrix

Open dnyansagar opened this issue 1 year ago • 3 comments

Hi, I am following the pipeline from the GLUE publication to perform regulatory inference. I am currently at /GLUE/tree/master/experiments/RegInf/s03_peak_gene_validation.py, where I get the following error:

==========================================================
Clustering metacells...

ValueError                                Traceback (most recent call last)
/tmp/ipykernel_3936564/4102559450.py in
      1 corr = biadjacency_matrix(
      2     utils.metacell_corr(
----> 3         rna, atac, "X_UMAP", n_meta=200, skeleton=dist_graph, method="spr"
      4     ), genes.index, peaks.index, weight="corr", dtype=np.float32
      5 )

~/utils.py in metacell_corr(rna, atac, use_rep, n_meta, skeleton, method)
     26 def metacell_corr(rna, atac, use_rep, n_meta=200, skeleton=None, method="spr"):
     27     print("Clustering metacells...")
---> 28     rna_agg, atac_agg = get_metacells_paired(rna, atac, use_rep, n_meta=n_meta)
     29     print("Computing correlation...")
     30     return _metacell_corr(rna_agg, atac_agg, skeleton=skeleton, method=method)

~/utils.py in get_metacells_paired(rna, atac, use_rep, n_meta)
     15     kmeans.train(rna.obsm[use_rep])
     16     _, rna.obs["metacell"] = kmeans.index.search(rna.obsm[use_rep], 1)
---> 17     atac.obs["metacell"] = rna.obs["metacell"].to_numpy()
     18     rna_agg = scglue.data.aggregate_obs(rna, "metacell")
     19     atac_agg = scglue.data.aggregate_obs(atac, "metacell")

~/miniconda3/envs/mypython3/lib/python3.7/site-packages/pandas/core/frame.py in __setitem__(self, key, value)
   3610         else:
   3611             # set column
-> 3612             self._set_item(key, value)
   3613
   3614     def _setitem_slice(self, key: slice, value):

~/miniconda3/envs/mypython3/lib/python3.7/site-packages/pandas/core/frame.py in _set_item(self, key, value)
   3782         ensure homogeneity.
   3783         """
-> 3784         value = self._sanitize_column(value)
   3785
   3786         if (

~/miniconda3/envs/mypython3/lib/python3.7/site-packages/pandas/core/frame.py in _sanitize_column(self, value)
   4507
   4508         if is_list_like(value):
-> 4509             com.require_length_match(value, self.index)
   4510         return sanitize_array(value, self.index, copy=True, allow_2d=True)
   4511

~/miniconda3/envs/mypython3/lib/python3.7/site-packages/pandas/core/common.py in require_length_match(data, index)
    530     if len(data) != len(index):
    531         raise ValueError(
--> 532             "Length of values "
    533             f"({len(data)}) "
    534             "does not match length of index "

ValueError: Length of values (80789) does not match length of index (73872)

Do the use_rep embeddings of 'rna' and 'atac' need to be the same size?

Can someone please explain the error?

dnyansagar avatar Jul 12 '22 14:07 dnyansagar

Thanks for the report!

The scripts in /GLUE/tree/master/experiments/RegInf include comparisons with other regulatory inference methods. These comparisons are unnecessary for end users and cannot be run as is. In this case, corr = biadjacency_matrix(...) computes correlation-based regulatory inference as one of these comparisons. We were using paired RNA and ATAC data there, so the utils.metacell_corr function only works with paired multi-omics profiles. The data you are working on seems to be unpaired (the number of cells differs between RNA and ATAC: 80789 vs. 73872 in your traceback). That's why it's throwing this error.
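To make the failure mode concrete, here is a minimal sketch that reproduces the same pandas error on toy data and shows a guard that could be run before the metacell step. The toy tables stand in for rna.obs and atac.obs, and check_paired is a hypothetical helper, not part of scglue or the experiment scripts.

```python
import numpy as np
import pandas as pd


def check_paired(rna_obs: pd.DataFrame, atac_obs: pd.DataFrame) -> None:
    """Hypothetical guard: fail early if the two obs tables differ in cell number."""
    if len(rna_obs) != len(atac_obs):
        raise ValueError(
            f"Unpaired data: {len(rna_obs)} RNA cells vs {len(atac_obs)} ATAC cells"
        )


# Toy stand-ins for rna.obs (5 cells) and atac.obs (3 cells).
rna_obs = pd.DataFrame(index=[f"rna_{i}" for i in range(5)])
atac_obs = pd.DataFrame(index=[f"atac_{i}" for i in range(3)])
rna_obs["metacell"] = np.array([0, 1, 0, 1, 2])

# Same assignment as line 17 of utils.py in the traceback:
# copying RNA metacell labels onto ATAC cells row by row.
try:
    atac_obs["metacell"] = rna_obs["metacell"].to_numpy()
    pandas_error = None
except ValueError as exc:
    pandas_error = str(exc)

# The guard reports the mismatch with a more descriptive message.
try:
    check_paired(rna_obs, atac_obs)
    guard_error = None
except ValueError as exc:
    guard_error = str(exc)
```

Note that equal cell numbers are only a necessary condition. Truly paired data also needs the same cells in the same order in both objects, which this sketch does not verify.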

A dedicated regulatory inference tutorial with no unnecessary code is still in the works (see #15). It should be ready by the end of this month. Sorry for the inconvenience. I will let you know then.

Jeff1995 avatar Jul 14 '22 16:07 Jeff1995

Thank you, Jeff, for your reply. Please let me know when the dedicated tutorial is ready. I look forward to it. In the meantime, I plan to subset the bigger object to match the size of the smaller object for integration. Do you think this approach will work, and is it theoretically sound?

dnyansagar avatar Jul 20 '22 08:07 dnyansagar

Of course! I'll let you know.

For the second question, the GLUE training process ensures that the smaller object is upsampled to match the bigger object, so I think there is no need to subset the bigger object.
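As a generic illustration of what upsampling means here (this is not scglue's actual sampling code, only a sketch under the assumption that upsampling is done by drawing cells with replacement), the smaller dataset can be made to contribute as many samples as the larger one. The function name upsample_indices is hypothetical; the cell numbers are taken from the traceback above.

```python
import numpy as np


def upsample_indices(n_small: int, n_large: int, seed: int = 0) -> np.ndarray:
    """Draw n_large row indices from a dataset of n_small cells, with replacement."""
    rng = np.random.default_rng(seed)
    return rng.choice(n_small, size=n_large, replace=True)


# 73872 ATAC cells upsampled to match 80789 RNA cells.
idx = upsample_indices(73872, 80789)
```

Every index in idx points at an existing cell of the smaller dataset, and some cells appear more than once, so no cells from the bigger dataset have to be discarded.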

Jeff1995 avatar Jul 21 '22 07:07 Jeff1995

Thank you, Jeff, for the new tutorial.

dnyansagar avatar Aug 18 '22 12:08 dnyansagar

You are welcome! The new tutorial requires an updated version (v0.3.0) of scglue. I've released it on PyPI, but I am still having some problems with the bioconda update. Hopefully I can get it working in a few days.

Let me know if you have further problems with the new tutorial though :)

Jeff1995 avatar Aug 20 '22 02:08 Jeff1995