Memory requirements in Summarise
Hi,
I attempted the workflow for 10X Genomics data suggested in #21. The program works well up to the assembly stage, but we ran into problems during the summarise stage.
I ran `summarise` with fairly common arguments (`-p 8 --graph_format pdf --infer_lineage`), but:

- `changeodb.tab`, `IMGT_gapped.tab`, `reconstructed_lengths_BCR[H|K|L].pdf`, `reconstructed_lengths_BCR[H|K|L].txt`, `full_length_seqs.pdf`, `changeo_input_[H|K|L]_clone-pass.tab` and `isotype_distribution.pdf` were generated almost immediately. However, `clonotype_network_[with|without]_identifiers.dot` took at least 8 hours to generate. Halting the script during this period showed that the time was spent in `make_cell_network_from_dna`, especially on lines 1094-1095 of `bracer_func.py` (see the sketch at the end of this post).
- I'm under the impression that this step also used an abnormal amount of memory: when I submitted a Grid Engine job with 8 cores and 176 GB of memory, the job was killed for exceeding its memory limit.
The data set includes 4,275 barcodes, which I don't believe is particularly many. Is there anything I can do about this?
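As a rough illustration of why this step can blow up (a minimal sketch, not BraCeR's actual code, and the clone sizes below are hypothetical): if cells of the same clonotype are connected pairwise, a clone of n cells contributes n(n-1)/2 edges, so the graph handed to Graphviz for layout grows quadratically with clone size.

```python
# Rough estimate of how many edges a pairwise clonotype network needs.
# Clone-size distributions below are made up for illustration.
def pairwise_edges(clone_sizes):
    """Each clone of n cells contributes n*(n-1)/2 edges."""
    return sum(n * (n - 1) // 2 for n in clone_sizes)

# 4,275 mostly unrelated cells: no edges, layout is trivial.
print(pairwise_edges([1] * 4275))         # 0

# The same 4,275 cells in a few large clones: millions of edges,
# and graph layout time/memory blows up accordingly.
print(pairwise_edges([2000, 1500, 775]))  # 3,423,175
```

So the same number of barcodes can produce a trivial network or one with millions of edges, depending entirely on how clonal the cells are.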
Hello bracer team... I am also following the procedure described in the prior post. I coerced the 10x data and ran `assemble` on 3,220 barcodes without issues. I am now running bracer in a Docker container on a Google Cloud instance (8 cores, 30 GB of RAM). It has been running for 2 days, and during this time I reliably lose the SSH connection to my cloud instance and can't regain it. Upon restarting the VM, I find the contents below in the `filtered_BCR_summary` directory.

I suspect the same issue as in this post. Can you give some guidance on how to overcome this, and what kind of memory I would need to request (in the cloud) to get to the final output?
Thanks so much! Andrew
```
total 11M
-rw-r--r-- 1 root root    0 Dec 20 13:05 BCR_summary.txt
-rw-r--r-- 1 root root 532K Dec 20 13:06 IMGT_gapped_db.tab
-rw-r--r-- 1 root root 878K Dec 20 13:06 changeo_input_H.tab
-rw-r--r-- 1 root root 884K Dec 20 13:06 changeo_input_H_clone-pass.tab
-rw-r--r-- 1 root root  23K Dec 20 13:06 changeo_input_K.tab
-rw-r--r-- 1 root root  24K Dec 20 13:06 changeo_input_K_clone-pass.tab
-rw-r--r-- 1 root root 1.4M Dec 20 13:06 changeo_input_L.tab
-rw-r--r-- 1 root root 1.4M Dec 20 13:06 changeo_input_L_clone-pass.tab
-rw-r--r-- 1 root root 5.4M Dec 20 13:06 changeodb.tab
-rw-r--r-- 1 root root    0 Dec 20 14:31 clonotype_network_with_identifiers.dot
-rw-r--r-- 1 root root  13K Dec 20 13:06 full_length_seqs.pdf
-rw-r--r-- 1 root root  12K Dec 20 13:06 isotype_distribution.pdf
drwxr-xr-x 2 root root 4.0K Dec 20 13:05 lineage_trees
-rw-r--r-- 1 root root  16K Dec 20 13:06 reconstructed_lengths_BCR_H.pdf
-rw-r--r-- 1 root root 6.1K Dec 20 13:06 reconstructed_lengths_BCR_H.txt
-rw-r--r-- 1 root root  16K Dec 20 13:06 reconstructed_lengths_BCR_K.pdf
-rw-r--r-- 1 root root  232 Dec 20 13:06 reconstructed_lengths_BCR_K.txt
-rw-r--r-- 1 root root  15K Dec 20 13:06 reconstructed_lengths_BCR_L.pdf
-rw-r--r-- 1 root root  12K Dec 20 13:06 reconstructed_lengths_BCR_L.txt
```
Hi,
Can you try running `summarise` with the `--no_networks` option? It's usually the network graph layout that becomes very slow and resource-intensive with large numbers of cells.
Mike
I would suggest the same as Mike. Do you expect your cells to be highly clonal? That could explain the huge memory use (the sketch after this reply shows one quick way to check). Unfortunately, we have not tested BraCeR on datasets as large as 10x ones.
Best, Ida
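A minimal sketch for checking the clonality question from the output already produced, assuming `changeodb.tab` is tab-separated and uses the Change-O `CLONE` column (adjust the path and column name to match your output):

```python
# Tabulate clone sizes from BraCeR's Change-O output to gauge clonality.
# Assumes a tab-separated file with a CLONE column (Change-O convention).
import pandas as pd

db = pd.read_csv("filtered_BCR_summary/changeodb.tab", sep="\t")
clone_sizes = db["CLONE"].value_counts()

print(clone_sizes.describe())  # how large do the clones get?
print(clone_sizes.head(10))    # the ten largest clones

# A handful of very large clones implies a dense clonotype network,
# which is what makes the graph layout step slow and memory-hungry.
```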