gemBS
Slowness when running
Hi, I have installed gemBS, but it runs very slowly on the working example.
For instance, the gemBS prepare -c example.conf -t example.csv command runs for several minutes, and the mapping takes several hours.
I am on a Lustre file system when I experience this behaviour. It looks like a deadlock, perhaps because of sqlite?
Thanks for the help,
Paul
Hi Paul,
Our cluster has a Lustre FS for bulk storage and a small shared NFS filesystem for home directories. I tend to run gemBS from the NFS filesystem, with all of the data (input and output) on Lustre and only the sqlite db on NFS. You can also try running with the --no-db option to gemBS prepare if you suspect that this is an interaction between sqlite and Lustre. In this case gemBS will not use an on-disk database at all. If this option is used, you have to be careful when running multiple gemBS jobs in parallel.
Hope this helps, Simon
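One way to check whether sqlite itself is slow on a given filesystem, independently of gemBS, is to time a few small committed writes in a directory on Lustre and compare with a directory on NFS or local disk. The following is a minimal sketch using only Python's standard sqlite3 module. The function name time_sqlite_commits and the table layout are hypothetical and chosen for illustration; this is not how gemBS accesses its database.

```python
import os
import sqlite3
import tempfile
import time


def time_sqlite_commits(directory, n_commits=20):
    """Return the seconds taken by n_commits small committed inserts
    into a throwaway sqlite database created inside `directory`.

    Hypothetical helper for diagnosis only; not part of gemBS.
    """
    fd, path = tempfile.mkstemp(suffix=".db", dir=directory)
    os.close(fd)  # sqlite treats the empty file as a new database
    try:
        conn = sqlite3.connect(path, timeout=30)
        conn.execute("CREATE TABLE t (k INTEGER PRIMARY KEY, v TEXT)")
        conn.commit()
        start = time.perf_counter()
        for i in range(n_commits):
            # Each commit takes and releases a file lock and syncs to disk,
            # which is where a slow or misbehaving filesystem would show up.
            conn.execute("INSERT INTO t (v) VALUES (?)", (f"row{i}",))
            conn.commit()
        elapsed = time.perf_counter() - start
        conn.close()
        return elapsed
    finally:
        # Remove the test database and any leftover rollback journal.
        for p in (path, path + "-journal"):
            if os.path.exists(p):
                os.remove(p)


if __name__ == "__main__":
    # Replace this list with the directories to compare,
    # e.g. one on Lustre and one on NFS.
    for d in [tempfile.gettempdir()]:
        print(f"{d}: {time_sqlite_commits(d):.3f} s for 20 commits")
```

If the Lustre directory is much slower than the others, or the script hangs there, that points towards file locking on that filesystem rather than gemBS itself.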
Hi Simon,
Using the --no-db option seems to solve the problem for the prepare step and the index step, but the mapping step freezes again. It generates the sample1_data_a.json, gem_mapper_sample1_data_a.err and sample1_data_a.bam files, and then nothing seems to happen (0% CPU use). Even a ^C takes time to quit.
Here is the log if it can help:
gemBS --loglevel debug map --ignore-db
2019-05-08 16:14:25,770 DEBUG: Updating mapping table
2019-05-08 16:14:25,774 DEBUG: Updating extract table
:
: Command map started at 2019-05-08 16:14:25.754631
:
: ------------ Mapping Parameters ------------
: Sample barcode : A001XS1
: Data set : sample1_data_a
: No. threads : 8
: Index : indexes/sacCer3.BS.gem
: Paired : True
: Read non stranded : False
: Reverse conversion: None
: Type : PAIRED
: Input Files : ./fastq/sample1/sample1_data_a_1.fastq.gz,./fastq/sample1/sample1_data_a_2.fastq.gz
: Output dir : ./mapping/A001XS1
:
: Bisulfite Mapping...
2019-05-08 16:14:25,778 DEBUG: Using bundled binary : /cvmfs/soft.mugqic/CentOS6/software/gemBS/gemBS-3.2.13/lib/python3.6/site-packages/gemBS/gemBSbinaries/gem-mapper
2019-05-08 16:14:26,250 DEBUG: Using bundled binary : /cvmfs/soft.mugqic/CentOS6/software/gemBS/gemBS-3.2.13/lib/python3.6/site-packages/gemBS/gemBSbinaries/readNameClean
2019-05-08 16:14:26,252 DEBUG: Using bundled binary : /cvmfs/soft.mugqic/CentOS6/software/gemBS/gemBS-3.2.13/lib/python3.6/site-packages/gemBS/bin/samtools
2019-05-08 16:14:26,252 INFO: Starting:
/cvmfs/soft.mugqic/CentOS6/software/gemBS/gemBS-3.2.13/lib/python3.6/site-packages/gemBS/gemBSbinaries/gem-mapper -I indexes/sacCer3.BS.gem --i1 ./fastq/sample1/sample1_data_a_1.fastq.gz --i2 ./fastq/sample1/sample1_data_a_2.fastq.gz -p --bisulfite-conversion inferred-C2T-G2A -t 8 --report-file ./mapping/A001XS1/sample1_data_a.json -r @RG\tID:sample1_data_a\tSM:sample1\tBC:A001XS1\tPU:sample1_data_a\tLB:LB1S23\tPL:None --underconversion-sequence NC_001416.1 --overconversion-sequence NC_001604.1 | /cvmfs/soft.mugqic/CentOS6/software/gemBS/gemBS-3.2.13/lib/python3.6/site-packages/gemBS/gemBSbinaries/readNameClean | /cvmfs/soft.mugqic/CentOS6/software/gemBS/gemBS-3.2.13/lib/python3.6/site-packages/gemBS/bin/samtools sort -T ./mapping/A001XS1/sample1_data_a -@ 8 -o ./mapping/A001XS1/sample1_data_a.bam -
2019-05-08 16:14:26,253 DEBUG: Setting process log file to ./mapping/A001XS1/gem_mapper_sample1_data_a.err
2019-05-08 16:14:26,266 DEBUG: Starting subprocess
2019-05-08 16:14:26,276 DEBUG: Setting process log file to ./mapping/A001XS1/gem_mapper_sample1_data_a.err
2019-05-08 16:14:26,294 DEBUG: Setting process input to parent output
2019-05-08 16:14:26,294 DEBUG: Starting subprocess
2019-05-08 16:14:26,304 DEBUG: Setting process log file to ./mapping/A001XS1/gem_mapper_sample1_data_a.err
2019-05-08 16:14:26,307 DEBUG: Setting process input to parent output
2019-05-08 16:14:26,307 DEBUG: Starting subprocess
Thanks for the help, Paul