Dragmap failed ERROR: This thread caught an exception first

Open quentin67100 opened this issue 3 years ago • 27 comments

Hi,

I want to test dragmap (I currently use bwa-mem2), but I get an error. First, a detail: I installed the latest version of dragmap in a Conda env. Command used:

dragen-os \
-r ${REF_Genome} \
-1 ${Fastq_DIR}/${read1} \
-2 ${Fastq_DIR}/${read2} \
--RGID HG001 \
--RGSM HG001 \
--num-threads ${CPU_number} \
| samtools view \
-b \
-h \
-L ${BED} \
-@ 2 \
> ${Align_DIR}/${ID}.trimmed.align.filtered.bam 2> ${Align_DIR}/logs/${ID}.trimmed.align.filtered.log

It failed after less than 2 minutes. At first it seems to work normally, using multiple threads, but then it drops to a single thread and eventually fails. I do get the beginning of a bam with aligned reads.

I am using 11 threads and I have 90G of memory.

The log file:

2021-10-20 18:51:23 [2ba35fd534c0] Version: 1.2.1
2021-10-20 18:51:23 [2ba35fd534c0] argc: 13 argv: dragen-os -r /shared/projects/gentaumix/dragen/reference -1 /shared/projects/gentaumix/HG001/02_Trimming/fastq_drag/HG001.trimmed.R1.fastq.gz -2 /shared/projects/gentaumix/HG001/02_Trimming/fastq_drag/HG001.trimmed.R2.fastq.gz --RGID HG001 --RGSM HG001 --num-threads 11
decompHashTableCtxInit...
  0.824 seconds
decompHashTableHeader...
  0.002 seconds
decompHashTableLiterals...
  1.926 seconds
decompHashTableExtIndex...
  0.041 seconds
decompHashTableAutoHits...
  44.869 seconds
decompHashTableSetFlags...
  6.205 seconds
finished decompress
Running dual fastq workflow on 11 threads. System supports 56 threads.
0 249 0 0 0 0 10000 1 40000 1 1000 0 0 0 6
0 250 0 0 0 0 10000 1 40000 1 1000 0 0 0 5
0 251 0 0 0 0 10000 1 40000 1 1000 0 0 0 4
0 252 0 0 0 0 10000 1 40000 1 1000 0 0 0 3
0 253 0 0 0 0 10000 1 40000 1 1000 0 0 0 2
0 254 0 0 0 0 10000 1 40000 1 1000 0 0 0 1
0 0 271 361 490 392.361 158.769 1 1147 1 789 89456 90372 0 0
Initial paired-end statistics detected for read group all, based on 89456 high quality pairs for FR orientation
Quartiles (25 50 75) = 271 361 490
Mean = 392.361
Standard deviation = 158.769
Rescue radius = 396.924
Effective rescue sigmas = 2.5
Boundaries for mean and standard deviation: low = 1, high = 928
Boundaries for proper pairs: low = 1, high = 1147
NOTE: DRAGEN's insert estimates include corrections for clipping (so they are not identical to TLEN)
[47982249010944] ERROR: This thread caught an exception first

Another detail: I get exactly the same error if I write the dragmap output to a sam file instead of piping it to samtools view. I also tried version 1.2.0 and got the same error.

How can I solve this?

quentin67100 avatar Oct 20 '21 17:10 quentin67100

Hi, if this is public data, could you share the input fastq so that I can replicate the error?

rizkg avatar Oct 21 '21 12:10 rizkg

It's the fastq from this accession: SRR14724533. I ran fastp with the default options plus poly-G tail trimming, then used the fastq output of fastp as input for dragmap.

quentin67100 avatar Oct 21 '21 12:10 quentin67100

Thanks for the info! I'll get back to you when I have news.

rizkg avatar Oct 21 '21 12:10 rizkg

I just tested dragmap directly on the fastq from this accession (without using fastp) and it works. On the other hand, it is quite slow: 16h for a 30x human genome, whereas bwa-mem2 takes 7.2h on this sample with equivalent resources. I suppose this is the kind of thing that will improve in future versions.

quentin67100 avatar Oct 22 '21 13:10 quentin67100

Hi, we were able to replicate the issue and found the cause. A fix will be available soon.

rizkg avatar Nov 04 '21 17:11 rizkg

Hi folks. I'm seeing the same error using some private in-house whole genome data:

dragen-os -r hg38_no_alt_dragmap_ref -b B46157_4_lanes_dupsFlagged.bam
2022-01-10 14:27:53 	[7f15d6033740]	Version: 1.2.1
2022-01-10 14:27:53 	[7f15d6033740]	argc: 5 argv: dragen-os -r hg38_no_alt_dragmap_ref -b B46157_4_lanes_dupsFlagged.bam
decompHashTableCtxInit...
  1.184 seconds
decompHashTableHeader...
  0.002 seconds
decompHashTableLiterals...
  3.299 seconds
decompHashTableExtIndex...
  0.094 seconds
decompHashTableAutoHits...
  24.441 seconds
decompHashTableSetFlags...
  2.636 seconds
finished decompress
Running fastq workflow on 144 threads. System supports 144 threads.
0	249	0	0	0	0	10000	1	40000	1	1000	0	0	0	6	
0	250	0	0	0	0	10000	1	40000	1	1000	0	0	0	5	
0	251	0	0	0	0	10000	1	40000	1	1000	0	0	0	4	
0	252	0	0	0	0	10000	1	40000	1	1000	0	0	0	3	
0	253	0	0	0	0	10000	1	40000	1	1000	0	0	0	2	
0	254	0	0	0	0	10000	1	40000	1	1000	0	0	0	1	
[139729232258816]	ERROR: This thread caught an exception first

I see that error after about 3 hours of runtime, and the processes seem to hang and never return. I installed this version through conda. Is there a recommended workaround?

RichardCorbett avatar Jan 11 '22 00:01 RichardCorbett

Hi Richard, We were able to find and fix this bug, which arises when mapping some very short reads. We will publish the fix on this repo very soon. Best, Guillaume
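
Since the bug is tied to very short reads, one way to see whether an input is likely to be affected is to count the reads below some length. A minimal sketch, assuming plain 4-line fastq records (no wrapped sequence lines); the helper name count_short_reads and the default threshold of 20 bases are illustrative assumptions, not values documented by dragmap:

```shell
# Hypothetical helper: count reads shorter than a minimum length in a FASTQ stream.
# Assumes standard 4-line FASTQ records, so the sequence is line 2 of each record.
count_short_reads() {
    min_len="${1:-20}"   # illustrative threshold, not a documented DRAGMAP limit
    awk -v min="$min_len" '
        NR % 4 == 2 && length($0) < min { n++ }   # sequence line shorter than min
        END { print n + 0 }                        # print 0 when nothing matched
    '
}
```

For example, gzip -dc HG001.trimmed.R1.fastq.gz | count_short_reads 20 would print how many reads in the trimmed R1 file are shorter than 20 bases.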

rizkg avatar Jan 11 '22 08:01 rizkg

Hi, A fix for this issue has been pushed to the master branch. Could you try again with the latest version from master on your data and check whether it fixes the bug you had? Guillaume

rizkg avatar Jan 20 '22 13:01 rizkg

Hi there.
I installed from master but got the same error again:

for b in $(ls *bam); do echo "/gsc/software/linux-x86_64-centos7/dragmap-1.2.1-5/bin/dragen-os -r hg38_no_alt_dragmap_ref -b ${b}  > ${b}_dragmap.sam"; done  | bash -x
+ /gsc/software/linux-x86_64-centos7/dragmap-1.2.1-5/bin/dragen-os -r hg38_no_alt_dragmap_ref -b B46157_4_lanes_dupsFlagged.bam
2022-01-20 12:33:03 	[7f177c99f7c0]	Version: 1.2.1-5-gf36d7849
2022-01-20 12:33:03 	[7f177c99f7c0]	argc: 5 argv: /gsc/software/linux-x86_64-centos7/dragmap-1.2.1-5/bin/dragen-os -r hg38_no_alt_dragmap_ref -b B46157_4_lanes_dupsFlagged.bam
decompHashTableCtxInit...
  1.741 seconds
decompHashTableHeader...
  0.002 seconds
decompHashTableLiterals...
  3.795 seconds
decompHashTableExtIndex...
  0.077 seconds
decompHashTableAutoHits...
  28.186 seconds
decompHashTableSetFlags...
  3.060 seconds
finished decompress
Running fastq workflow on 144 threads. System supports 144 threads.
0	249	0	0	0	0	10000	1	40000	1	1000	0	0	0	6	
0	250	0	0	0	0	10000	1	40000	1	1000	0	0	0	5	
0	251	0	0	0	0	10000	1	40000	1	1000	0	0	0	4	
0	252	0	0	0	0	10000	1	40000	1	1000	0	0	0	3	
0	253	0	0	0	0	10000	1	40000	1	1000	0	0	0	2	
0	254	0	0	0	0	10000	1	40000	1	1000	0	0	0	1	
[139737098520320]	ERROR: This thread caught an exception first

RichardCorbett avatar Jan 20 '22 21:01 RichardCorbett

Hi, thanks for checking. I am working on it.

rizkg avatar Jan 21 '22 15:01 rizkg

Hi, a new fix was pushed to the master branch. Could you check again on your data? Thanks, Guillaume

rizkg avatar Feb 03 '22 15:02 rizkg

Looks like I still get an error:

dragen-os -r hg38_no_alt_dragmap_ref -b B46157_4_lanes_dupsFlagged.bam
2022-02-03 10:28:41 	[7f320a6287c0]	Version: 1.2.1-7-gc87d93aa
2022-02-03 10:28:41 	[7f320a6287c0]	argc: 5 argv: /gsc/software/linux-x86_64-centos7/dragmap-1.2.1-7/bin/dragen-os -r hg38_no_alt_dragmap_ref -b B46157_4_lanes_dupsFlagged.bam
decompHashTableCtxInit...
  1.505 seconds
decompHashTableHeader...
  0.002 seconds
decompHashTableLiterals...
  3.205 seconds
decompHashTableExtIndex...
  0.070 seconds
decompHashTableAutoHits...
  23.794 seconds
decompHashTableSetFlags...
  1.850 seconds
finished decompress
Running fastq workflow on 144 threads. System supports 144 threads.
0	249	0	0	0	0	10000	1	40000	1	1000	0	0	0	6	
0	250	0	0	0	0	10000	1	40000	1	1000	0	0	0	5	
0	251	0	0	0	0	10000	1	40000	1	1000	0	0	0	4	
0	252	0	0	0	0	10000	1	40000	1	1000	0	0	0	3	
0	253	0	0	0	0	10000	1	40000	1	1000	0	0	0	2	
0	254	0	0	0	0	10000	1	40000	1	1000	0	0	0	1	
[139851179972352]	ERROR: This thread caught an exception first

RichardCorbett avatar Feb 03 '22 20:02 RichardCorbett

I have permission to share the data with you if it helps

RichardCorbett avatar Feb 03 '22 20:02 RichardCorbett

Hi Richard, Yes, that would be very helpful! How big is it?

rizkg avatar Feb 03 '22 20:02 rizkg

To share the bam and the reference I am using, it would be about 52Gb.

RichardCorbett avatar Feb 03 '22 20:02 RichardCorbett

Hi @rizkg, Have you had any luck reproducing my error? I am under some pressure at my center to have this up and running, so please let me know if there is anything else I can provide.

RichardCorbett avatar Feb 16 '22 23:02 RichardCorbett

Also, do you think it may help if I try running your binary directly (or in a container?)

RichardCorbett avatar Feb 16 '22 23:02 RichardCorbett

Hello, Yes I have been able to reproduce the error. It does not seem to come from your hashtable or from your binary. The problem seems to be in the bam parsing code. As a temporary workaround, you could first convert your bam to fastq, e.g.

samtools bam2fq B46157_4_lanes_dupsFlagged.bam | gzip > file.fastq.gz

And then run dragmap with this fastq file, e.g.

dragen-os -r hg38_no_alt_dragmap_ref -1 file.fastq.gz --output-directory ./ --output-file-prefix B46157

I'll keep you posted as soon as we have a fix for this.

rizkg avatar Feb 17 '22 13:02 rizkg

Thanks. Trying it out now.

RichardCorbett avatar Feb 17 '22 15:02 RichardCorbett

Hello again, Forget what I said before; that would give you single-end mapping. The issue is that we do not support bam input sorted by coordinate; it should be sorted by read name. So you should do, e.g.

samtools sort --threads 16 -n B46157_4_lanes_dupsFlagged.bam > B46157_4_lanes_dupsFlagged_name_sorted.bam

And then use the name-sorted bam as dragmap input, and specify --interleaved true in the dragmap options to get paired mapping. We'll add a proper check and error message for this problem.
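
Until such a check exists in dragmap, the sort order recorded in the bam header can be inspected before launching a run. A minimal sketch, assuming the header carries an @HD line with an SO tag as defined in the SAM specification; the helper name header_sort_order is hypothetical, and the header only records what the producing tool declared, so treat the result as a hint rather than proof:

```shell
# Hypothetical helper: read a SAM/BAM header (text) on stdin and print the value
# of the SO (sort order) tag from the @HD line, or "unknown" if it is absent.
header_sort_order() {
    awk -F '\t' '
        $1 == "@HD" {
            for (i = 2; i <= NF; i++)
                if ($i ~ /^SO:/) { print substr($i, 4); found = 1; exit }
        }
        END { if (!found) print "unknown" }
    '
}
```

Piping samtools view -H B46157_4_lanes_dupsFlagged.bam into header_sort_order would be expected to print coordinate for a coordinate-sorted file and queryname for the output of samtools sort -n; anything other than queryname suggests name-sorting the bam first.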

rizkg avatar Feb 17 '22 17:02 rizkg

I'm having the same problem, except that my input is paired fastq files instead of a bam file.
The command looks like this: dragen-os -r /paedyl01/disk1/yangyxt/indexed_genome/hg19 -1 /paedyl01/disk1/yangyxt/wgs/9_samples_20201202/trimmed_sequences/A160792B_1_val_1.fq.gz -2 /paedyl01/disk1/yangyxt/wgs/9_samples_20201202/trimmed_sequences/A160792B_2_val_2.fq.gz --num-threads 23 --Aligner.sec-aligns 5 --fastq-offset 30 --Aligner.sw-method dragen --verbose --RGID A160792B --RGSM A160792B --output-directory /paedyl01/disk1/yangyxt/wgs/9_samples_20201202/aligned_results --output-file-prefix A160792B

And here is the error log:

2022-04-22 17:37:05 [2b2fe2e5ee00] Version: 1.2.1
2022-04-22 17:37:05 [2b2fe2e5ee00] argc: 24 argv: dragen-os -r /paedyl01/disk1/yangyxt/indexed_genome/hg19 -1 /paedyl01/disk1/yangyxt/wgs/9_samples_20201202/trimmed_sequences/A160792B_1_val_1.fq.gz -2 /paedyl01/disk1/yangyxt/wgs/9_samples_20201202/trimmed_sequences/A160792B_2_val_2.fq.gz --num-threads 23 --Aligner.sec-aligns 5 --fastq-offset 30 --Aligner.sw-method dragen --verbose --RGID A160792B --RGSM A160792B --output-directory /paedyl01/disk1/yangyxt/wgs/9_samples_20201202/aligned_results --output-file-prefix A160792B.bqsr
decompHashTableCtxInit...
  1.133 seconds
decompHashTableHeader...
  0.001 seconds
decompHashTableLiterals...
  1.627 seconds
decompHashTableExtIndex...
  0.044 seconds
decompHashTableAutoHits...
  19.191 seconds
decompHashTableSetFlags...
  1.453 seconds
finished decompress
INFO: writing SAM file to "/paedyl01/disk1/yangyxt/wgs/9_samples_20201202/aligned_results/A160792B.bqsr.sam"
INFO: writing mapping metrics stats into "/paedyl01/disk1/yangyxt/wgs/9_samples_20201202/aligned_results/A160792B.bqsr.mapping_metrics.csv"
INFO: writing insert stats into "/paedyl01/disk1/yangyxt/wgs/9_samples_20201202/aligned_results/A160792B.bqsr.insert-stats.tab"
Running dual fastq workflow on 23 threads. System supports 80 threads.
Initial paired-end statistics detected for read group all, based on 88335 high quality pairs for FR orientation
Quartiles (25 50 75) = 233 300 373
Mean = 304.777
Standard deviation = 106.367
Rescue radius = 265.917
Effective rescue sigmas = 2.5
Boundaries for mean and standard deviation: low = 1, high = 653
Boundaries for proper pairs: low = 1, high = 793
NOTE: DRAGEN's insert estimates include corrections for clipping (so they are not identical to TLEN)
[47523547105024] ERROR: This thread caught an exception first
/paedyl01/disk1/yangyxt/ngs_scripts/common_bash_utils.sh: line 3651: 297460 Segmentation fault (core dumped) dragen-os -r ${ref_genome_dir} -1 ${forward_reads} -2 ${reverse_reads} --num-threads ${threads} --Aligner.sec-aligns 5 --fastq-offset 30 --Aligner.sw-method dragen --verbose --RGID ${samp_ID} --RGSM ${samp_ID} --output-directory $(dirname ${output_align}) --output-file-prefix $(basename ${output_align/.bam/})

yangyxt avatar Apr 22 '22 09:04 yangyxt

Sorry I dunno why the text wrap is disabled... I'll paste the key lines from the error log down below:

Initial paired-end statistics detected for read group all, based on 88335 high quality pairs for FR orientation
Quartiles (25 50 75) = 233 300 373
Mean = 304.777
Standard deviation = 106.367
Rescue radius = 265.917
Effective rescue sigmas = 2.5
Boundaries for mean and standard deviation: low = 1, high = 653
Boundaries for proper pairs: low = 1, high = 793
NOTE: DRAGEN's insert estimates include corrections for clipping (so they are not identical to TLEN)
[47523547105024] ERROR: This thread caught an exception first
/paedyl01/disk1/yangyxt/ngs_scripts/common_bash_utils.sh: line 3651: 297460 Segmentation fault (core dumped)

yangyxt avatar Apr 22 '22 09:04 yangyxt

Hi, Thanks for your report. Although this is the same error message as in the previous reports in this thread, I am not sure it has the same cause. We are working on reporting more meaningful error messages. Meanwhile, would you be able to share your input files?

rizkg avatar May 03 '22 14:05 rizkg

Thank you for the response! I'm not sure I can. Even if I want to, the FASTQ files are huge since they are WGS samples.

yangyxt avatar May 07 '22 15:05 yangyxt

Hi, I am facing the same issue. I am aligning my short reads to the SARS-CoV-2 reference genome.

dragen-os --num-threads 10 -r results/04_alignDRAGMAP/index/dragmapidx -1 data/S9_1.fastq.gz -2 data/S9_2.fastq.gz > temp.sam

2022-07-28 19:30:32 	[14afcfe29740]	Version: 1.3.0
2022-07-28 19:30:32 	[14afcfe29740]	argc: 9 argv: dragen-os --num-threads 10 -r results/04_alignDRAGMAP/index/dragmapidx -1 data/S9_1.fastq.gz -2 data/S9_2.fastq.gz
decompHashTableCtxInit...
  0.000 seconds
decompHashTableHeader...
  0.002 seconds
decompHashTableLiterals...
  0.004 seconds
decompHashTableExtIndex...
  0.000 seconds
decompHashTableAutoHits...
  0.010 seconds
decompHashTableSetFlags...
  0.004 seconds
finished decompress
Running dual fastq workflow on 10 threads. System supports 112 threads.
0	249	0	0	0	0	10000	1	40000	1	1000	0	0	0	6	
0	250	0	0	0	0	10000	1	40000	1	1000	0	0	0	5	
0	251	0	0	0	0	10000	1	40000	1	1000	0	0	0	4	
0	252	0	0	0	0	10000	1	40000	1	1000	0	0	0	3	
0	253	0	0	0	0	10000	1	40000	1	1000	0	0	0	2	
0	254	0	0	0	0	10000	1	40000	1	1000	0	0	0	1	
Segmentation fault (core dumped)

The index was created using:

samtools faidx dragmapidx/$fasta 
gatk CreateSequenceDictionary -R dragmapidx/$fasta	
dragen-os --build-hash-table true --ht-reference dragmapidx/$fasta  --output-directory dragmapidx --ht-num-threads 20
gatk ComposeSTRTableFile -R dragmapidx/$fasta -O dragmapidx/str_table.tsv

and the directory looks like this:

hash_table.cfg
hash_table.cfg.bin
hash_table.cmp
hash_table_stats.txt
reference.bin
ref_index.bin
repeat_mask.bin
sequence.dict
sequence.fasta
sequence.fasta.fai
str_table.bin
str_table.tsv

Rohit-Satyam avatar Jul 28 '22 16:07 Rohit-Satyam