
Slowness when running

Open paulstretenowich opened this issue 5 years ago • 2 comments

Hi, I have installed gemBS, but it is very slow when I run it on the working example. For instance, the gemBS prepare -c example.conf -t example.csv command runs for several minutes and the mapping takes several hours. I see this behaviour on a Lustre file system; it looks like a deadlock, perhaps because of sqlite? Thanks for the help, Paul

paulstretenowich, May 08 '19 16:05

Hi Paul,

Our cluster has a Lustre FS for bulk storage and a small shared NFS filesystem for home directories. I tend to run gemBS from the NFS filesystem, so that only the sqlite db is on NFS while all of the data (input and output) is on Lustre. If you suspect an interaction between sqlite and Lustre, you can also try running gemBS prepare with the --no-db option. In that case gemBS will not use an on-disk database at all. If you use this option, you have to be careful when running multiple gemBS jobs in parallel.
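One way to check the sqlite suspicion independently of gemBS is to time a handful of small sqlite transactions in a directory on each filesystem and compare the results. The following is only a rough sketch using Python's standard sqlite3 module; the function name time_sqlite_commits and the scratch file prefix are made up for illustration and are not part of gemBS, and a slow result here would only hint at a locking problem, not prove one.

```python
import os
import sqlite3
import tempfile
import time


def time_sqlite_commits(directory, n_commits=20, timeout=30.0):
    """Return the seconds taken by n_commits small sqlite transactions in `directory`.

    Every commit makes sqlite take and release file locks, which is the
    operation suspected of being slow on some shared filesystems.
    """
    # Hypothetical scratch database; it is removed again before returning.
    fd, db_path = tempfile.mkstemp(suffix=".db", prefix="sqlite_probe_", dir=directory)
    os.close(fd)
    try:
        start = time.perf_counter()
        # `timeout` is how long sqlite waits on a locked database before giving up.
        conn = sqlite3.connect(db_path, timeout=timeout)
        try:
            conn.execute("CREATE TABLE probe (id INTEGER PRIMARY KEY, note TEXT)")
            for i in range(n_commits):
                conn.execute("INSERT INTO probe (note) VALUES (?)", (f"row {i}",))
                conn.commit()
        finally:
            conn.close()
        return time.perf_counter() - start
    finally:
        # Remove the database and any sidecar files sqlite may have left behind.
        for suffix in ("", "-journal", "-wal", "-shm"):
            try:
                os.remove(db_path + suffix)
            except FileNotFoundError:
                pass
```

Calling time_sqlite_commits once with a directory on Lustre and once with a directory on NFS (or local disk) gives two timings to compare. If the Lustre figure is much larger, or the call fails with a "database is locked" error, that would be consistent with the sqlite and Lustre interaction described above.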

Hope this helps, Simon

heathsc, May 08 '19 16:05

Hi Simon,

Using the --no-db option seems to solve the problem for the prepare and index steps, but the mapping step freezes again. It generates the sample1_data_a.json, gem_mapper_sample1_data_a.err and sample1_data_a.bam files, and then nothing seems to happen (0% CPU use); even ^C takes a while to quit.

Here is the log if it can help:

gemBS --loglevel debug map --ignore-db
2019-05-08 16:14:25,770 DEBUG: Updating mapping table
2019-05-08 16:14:25,774 DEBUG: Updating extract table
:
: Command map started at 2019-05-08 16:14:25.754631
:
: ------------ Mapping Parameters ------------
: Sample barcode : A001XS1
: Data set : sample1_data_a
: No. threads : 8
: Index : indexes/sacCer3.BS.gem
: Paired : True
: Read non stranded : False
: Reverse conversion: None
: Type : PAIRED
: Input Files : ./fastq/sample1/sample1_data_a_1.fastq.gz,./fastq/sample1/sample1_data_a_2.fastq.gz
: Output dir : ./mapping/A001XS1
:
: Bisulfite Mapping...
2019-05-08 16:14:25,778 DEBUG: Using bundled binary : /cvmfs/soft.mugqic/CentOS6/software/gemBS/gemBS-3.2.13/lib/python3.6/site-packages/gemBS/gemBSbinaries/gem-mapper
2019-05-08 16:14:26,250 DEBUG: Using bundled binary : /cvmfs/soft.mugqic/CentOS6/software/gemBS/gemBS-3.2.13/lib/python3.6/site-packages/gemBS/gemBSbinaries/readNameClean
2019-05-08 16:14:26,252 DEBUG: Using bundled binary : /cvmfs/soft.mugqic/CentOS6/software/gemBS/gemBS-3.2.13/lib/python3.6/site-packages/gemBS/bin/samtools
2019-05-08 16:14:26,252 INFO: Starting: /cvmfs/soft.mugqic/CentOS6/software/gemBS/gemBS-3.2.13/lib/python3.6/site-packages/gemBS/gemBSbinaries/gem-mapper -I indexes/sacCer3.BS.gem --i1 ./fastq/sample1/sample1_data_a_1.fastq.gz --i2 ./fastq/sample1/sample1_data_a_2.fastq.gz -p --bisulfite-conversion inferred-C2T-G2A -t 8 --report-file ./mapping/A001XS1/sample1_data_a.json -r @RG\tID:sample1_data_a\tSM:sample1\tBC:A001XS1\tPU:sample1_data_a\tLB:LB1S23\tPL:None --underconversion-sequence NC_001416.1 --overconversion-sequence NC_001604.1 | /cvmfs/soft.mugqic/CentOS6/software/gemBS/gemBS-3.2.13/lib/python3.6/site-packages/gemBS/gemBSbinaries/readNameClean | /cvmfs/soft.mugqic/CentOS6/software/gemBS/gemBS-3.2.13/lib/python3.6/site-packages/gemBS/bin/samtools sort -T ./mapping/A001XS1/sample1_data_a -@ 8 -o ./mapping/A001XS1/sample1_data_a.bam -
2019-05-08 16:14:26,253 DEBUG: Setting process log file to ./mapping/A001XS1/gem_mapper_sample1_data_a.err
2019-05-08 16:14:26,266 DEBUG: Starting subprocess
2019-05-08 16:14:26,276 DEBUG: Setting process log file to ./mapping/A001XS1/gem_mapper_sample1_data_a.err
2019-05-08 16:14:26,294 DEBUG: Setting process input to parent output
2019-05-08 16:14:26,294 DEBUG: Starting subprocess
2019-05-08 16:14:26,304 DEBUG: Setting process log file to ./mapping/A001XS1/gem_mapper_sample1_data_a.err
2019-05-08 16:14:26,307 DEBUG: Setting process input to parent output
2019-05-08 16:14:26,307 DEBUG: Starting subprocess

Thanks for the help, Paul

paulstretenowich, May 08 '19 19:05