help in running BERMUDA

Open nmalwinka opened this issue 5 years ago • 5 comments

Hi, would you be able to add an example script showing how to connect the pre-processing in R with the follow-up autoencoder in Python, please?

nmalwinka avatar May 23 '19 12:05 nmalwinka

Hi,

We used two R packages and saved the results as .csv files in order to run BERMUDA. You can follow the preprocessing steps in BERMUDA/R/pre_processing.R. First, we used Seurat to find highly variable genes and to cluster the cells of each batch (e.g. BERMUDA/pancreas/muraro_seurat.csv). Then, we used MetaNeighbor to generate a similarity matrix between the clusters of different batches (e.g. BERMUDA/pancreas/pancreas_metaneighbor.csv). Once you have the required .csv files, you can run BERMUDA directly (e.g. BERMUDA/main_pancreas.py). Hope this is helpful.
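
As a rough illustration, here is a minimal R sketch of those two preprocessing steps for a pair of batches. It is not the exact pre_processing.R: it assumes a Seurat v3-style API and the Bioconductor release of MetaNeighbor, and the precise .csv layout that main_pancreas.py expects should be taken from the example files above.

```r
# Minimal sketch (assumptions noted above): cluster each batch with Seurat,
# then score cluster similarity across batches with MetaNeighbor.
library(Seurat)
library(MetaNeighbor)
library(SummarizedExperiment)

# expr1, expr2: gene x cell count matrices for the two batches (placeholders).
process_batch <- function(counts, name) {
  obj <- CreateSeuratObject(counts = counts, project = name)
  obj <- NormalizeData(obj)
  obj <- FindVariableFeatures(obj, nfeatures = 2000)
  obj <- ScaleData(obj)
  obj <- RunPCA(obj)
  obj <- FindNeighbors(obj, dims = 1:20)
  FindClusters(obj, resolution = 0.8)
}
b1 <- process_batch(expr1, "batch1")
b2 <- process_batch(expr2, "batch2")

# One "<batch>_seurat.csv" per batch: variable-gene expression plus cluster labels.
# Check BERMUDA/pancreas/muraro_seurat.csv for the exact layout BERMUDA expects.
write.csv(cbind(t(as.matrix(GetAssayData(b1, slot = "data"))[VariableFeatures(b1), ]),
                cluster = as.integer(Idents(b1))),
          "batch1_seurat.csv")

# MetaNeighbor similarity between clusters of different batches.
shared    <- intersect(rownames(b1), rownames(b2))
var_genes <- intersect(union(VariableFeatures(b1), VariableFeatures(b2)), shared)
dat       <- cbind(as.matrix(GetAssayData(b1, slot = "data"))[shared, ],
                   as.matrix(GetAssayData(b2, slot = "data"))[shared, ])
study_id  <- c(rep("batch1", ncol(b1)), rep("batch2", ncol(b2)))
cell_type <- c(paste0("batch1_", Idents(b1)), paste0("batch2_", Idents(b2)))

# The Bioconductor MetaNeighborUS expects a SummarizedExperiment
# (older versions took a plain matrix -- adapt to your installed version).
se  <- SummarizedExperiment(assays = list(expr = dat))
sim <- MetaNeighborUS(var_genes = var_genes, dat = se,
                      study_id = study_id, cell_type = cell_type)
write.csv(sim, "pancreas_metaneighbor.csv")
```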

Best, Tongxin

txWang avatar May 25 '19 04:05 txWang

Hi again, my dataset is quite big and I run out of memory, getting this error:

Error: cannot allocate vector of size 656.7 Gb
Execution halted

The MetaNeighbor package from Maggie Crow has some updated code to avoid vectorising (https://github.com/gillislab/MetaNeighbor/blob/master/R/MetaNeighborUS.R, see MetaNeighborUSLowMem). Have you tried updating your code so that bigger datasets can be run with BERMUDA?
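
For reference, in recent Bioconductor releases of MetaNeighbor the low-memory implementation in that file appears to be reachable through the fast_version argument of MetaNeighborUS. This is an assumption based on the MetaNeighbor documentation, not something confirmed against BERMUDA's pre_processing.R, so please check the installed version first:

```r
# Assumption: in recent MetaNeighbor releases, fast_version = TRUE switches
# MetaNeighborUS to the low-memory code path (MetaNeighborUSLowMem in the
# file linked above). Verify against your installed version.
library(MetaNeighbor)

# se, var_genes, study_id, cell_type: the same inputs as in the sketch above.
sim <- MetaNeighborUS(var_genes = var_genes,
                      dat = se,
                      study_id = study_id,
                      cell_type = cell_type,
                      fast_version = TRUE)  # avoids building the huge dense matrix
write.csv(sim, "pancreas_metaneighbor.csv")
```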

nmalwinka avatar Jun 24 '19 12:06 nmalwinka

I managed to figure it out by myself. I have a problem with the result, though. After loading code_list and producing code, I expected it to be the same array size as data, but it isn't:

>>> code.shape
(51687, 20)
>>> data.shape
(51687, 2583)

There is the same number of cells, but only 20 genes(?) instead of the 2583 variable genes.

A further question: how do I transform this back into a Seurat object? Many thanks.

nmalwinka avatar Jun 27 '19 08:06 nmalwinka

Hi,

Thank you for your question. Similar to many batch correction methods, BERMUDA removes batch effects by projecting the original data into a low-dimensional space (the dimensionality is 20 here). The low-dimensional code does not suffer from batch effects and can be used for further analysis such as visualization. Currently, we do not support converting our results back into Seurat objects.
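
That said, if you would like to continue downstream analysis in Seurat, a rough workaround (not part of BERMUDA; it assumes you export the 20-dimensional code to a .csv, e.g. with numpy.savetxt, with cells in the same order as in your Seurat object) is to attach it as a custom dimensional reduction:

```r
# Sketch only: attach BERMUDA's low-dimensional code to an existing Seurat object
# "obj" as a custom reduction, then run UMAP/clustering on top of it.
# Assumes code.csv is cells x 20, with rows ordered exactly like colnames(obj).
library(Seurat)

code <- as.matrix(read.csv("code.csv", header = FALSE))
rownames(code) <- colnames(obj)
colnames(code) <- paste0("BERMUDA_", seq_len(ncol(code)))

obj[["bermuda"]] <- CreateDimReducObject(embeddings = code,
                                         key = "BERMUDA_",
                                         assay = DefaultAssay(obj))

# Use the batch-corrected code instead of PCA for neighbours and UMAP.
obj <- RunUMAP(obj, reduction = "bermuda", dims = 1:ncol(code))
obj <- FindNeighbors(obj, reduction = "bermuda", dims = 1:ncol(code))
obj <- FindClusters(obj)
DimPlot(obj, reduction = "umap", group.by = "orig.ident")
```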

Best, Tongxin

txWang avatar Jul 12 '19 09:07 txWang

Hi,

I am very interested in your BERMUDA work, but I have a problem running "pre_processing.R" in the BERMUDA/R folder. I am wondering if you could provide the two datasets, namely "muraro_human.csv" and "baron_human.csv", which are required by "pre_processing.R". Thank you in advance.

yzcv avatar Nov 12 '19 14:11 yzcv