The avx512bw binary from the ert branch core dumps

Open KBT59 opened this issue 4 years ago • 12 comments

I compiled the ert branch of bwa-mem2 using Intel oneAPI and tried the avx512bw executable from that build. It runs for several minutes and then fails with free(): invalid pointer, like this:

[0000] Calling mem_process_seqs.., task: 26

[0000] 1. Calling kt_for - worker_bwt

[0000] read_chunk: 160000000, work_chunk_size: 160000223, nseq: 1247564

    [0000][ M::kt_pipeline] read 1247564 sequences (160000223 bp)...

[0000] 2. Calling kt_for - worker_aln

[0000] Inferring insert size distribution of PE reads from data, l_pac: 3137161264, n: 1253988

[0000][PE] analyzing insert size distribution for orientation FF...

[0000][PE] (25, 50, 75) percentile: (106, 146, 198)

[0000][PE] low and high boundaries for computing mean and std.dev: (1, 382)

[0000][PE] mean and std.dev: (152.58, 62.44)

[0000][PE] low and high boundaries for proper pairs: (1, 474)

[0000][PE] analyzing insert size distribution for orientation FR...

[0000][PE] (25, 50, 75) percentile: (102, 155, 222)

[0000][PE] low and high boundaries for computing mean and std.dev: (1, 462)

[0000][PE] mean and std.dev: (167.60, 87.37)

[0000][PE] low and high boundaries for proper pairs: (1, 582)

[0000][PE] skip orientation RF as there are not enough pairs

[0000][PE] analyzing insert size distribution for orientation RR...

[0000][PE] (25, 50, 75) percentile: (116, 176, 278)

[0000][PE] low and high boundaries for computing mean and std.dev: (1, 602)

[0000][PE] mean and std.dev: (197.88, 109.28)

[0000][PE] low and high boundaries for proper pairs: (1, 764)

[0000][PE] skip orientation FF

[0000][PE] skip orientation RR

[0000] 3. Calling kt_for - worker_sam

free(): invalid pointer

Aborted (core dumped)

KBT59 avatar May 29 '21 13:05 KBT59

Are you using the ert index? Could you please share the command line?

arun-sub avatar May 29 '21 13:05 arun-sub

For example, I used this command line: ./bwa-mem2.avx512bw mem -c 250 -M -t 16 -v 1 /isilon/pm/brad-projects/TimeTrials/Index_for_ert_24MAY2021/hg19.fasta A01-TmCt01-50-20_S1_R1_001.fastq.gz A01-TmCt01-50-20_S1_R2_001.fastq.gz > out4.sam

The index was made using the same binary.

This morning I used gdb, and so far I’ve only found a reference to free(w->regs[i+1].a) at line 1540 in src/bwamem.cpp.

Also, I tried building a static binary on a different machine using icpc under Fedora instead of Ubuntu. It did run, but it failed at the same point where the program has failed so far.

I do get a lot of data out into the sam file before the program fails. I suspect it fails on the last reads in the fastq files.

Thanks for looking into this!

Best regards, Brad Thomas

KBT59 avatar May 30 '21 17:05 KBT59

Thanks Brad. I think you forgot to add -Z before the index prefix in the command line. You need to use -Z to indicate that you are using the ERT index instead of the default index for BWA-MEM2.

Could you try -Z /isilon/pm/brad-projects/TimeTrials/Index_for_ert_24MAY2021/hg19.fasta?

--Arun

arun-sub avatar May 30 '21 17:05 arun-sub

And just to double check, can you also make sure /isilon/pm/brad-projects/TimeTrials/Index_for_ert_24MAY2021/hg19.fasta.mlt_table and /isilon/pm/brad-projects/TimeTrials/Index_for_ert_24MAY2021/hg19.fasta.kmer_table exist?
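A quick way to run that check is a small shell function. This is only a sketch: `check_ert_index` is a hypothetical helper name, not part of bwa-mem2, and it tests only the two file suffixes mentioned above (an ERT index may include other files as well).

```shell
# Hypothetical helper (not part of bwa-mem2): report which of the two
# ERT index files named in this thread are missing or empty for a prefix.
check_ert_index() {
    prefix="$1"
    missing=0
    for ext in mlt_table kmer_table; do
        # -s is true only if the file exists and has a size greater than zero
        if [ ! -s "${prefix}.${ext}" ]; then
            echo "missing or empty: ${prefix}.${ext}"
            missing=1
        fi
    done
    # Exit status 0 means both files were found; 1 means at least one was not
    return "$missing"
}
```

Calling `check_ert_index` with the index prefix (the same path that follows -Z) prints one line per missing file and returns a non-zero status if any is absent.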

arun-sub avatar May 30 '21 17:05 arun-sub

I do see the hg19.fasta.mlt_table and hg19.fasta.kmer_table after I rebuilt the index. I then ran

./bwa-mem2.avx512bw mem -K 100000000 -c 250 -M -t 16 -v 1 -Z /isilon/pm/brad-projects/TimeTrials/AVX512BW_ERT_INDEX/hg19.fasta A01-TmCt01-50-20_S1_R1_001.fastq.gz A01-TmCt01-50-20_S1_R2_001.fastq.gz > out6.sam

The program still fails (dumps core) with the same message: free(): invalid pointer

KBT59 avatar May 31 '21 01:05 KBT59

Could you share your complete run log? I will take a look. Thanks!

arun-sub avatar May 31 '21 01:05 arun-sub

I'm naive. Where is the run log?

KBT59 avatar May 31 '21 02:05 KBT59

If you could redirect stderr to a file and attach it here, that would be great.

./bwa-mem2.avx512bw mem -K 100000000 -c 250 -M -t 16 -v 1 -Z /isilon/pm/brad-projects/TimeTrials/AVX512BW_ERT_INDEX/hg19.fasta A01-TmCt01-50-20_S1_R1_001.fastq.gz A01-TmCt01-50-20_S1_R2_001.fastq.gz > out6.sam 2> out6.log

Could you also share the dataset, or a small part of it that fails? It would help me reproduce what you are seeing. If that's not possible, that's ok.
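Since the crash is suspected to occur on the last reads, one way to cut a small test case is to keep only the tail of each FASTQ file. The sketch below assumes standard 4-line FASTQ records (no wrapped sequence lines); `fastq_tail` is a hypothetical helper name, and there is no guarantee that the subset reproduces the failure.

```shell
# Hypothetical helper: write the last N records of a gzipped FASTQ file
# to a new gzipped file. Assumes exactly 4 lines per record.
fastq_tail() {
    in="$1"    # input .fastq.gz
    n="$2"     # number of records to keep
    out="$3"   # output .fastq.gz
    gzip -dc "$in" | tail -n "$((n * 4))" | gzip -c > "$out"
}
```

Running it with the same N on both the R1 and R2 files keeps mates paired, provided the two files hold the same number of records in the same order.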

arun-sub avatar May 31 '21 02:05 arun-sub

The fastq files are somewhat large. How may I get them to you?

KBT59 avatar May 31 '21 02:05 KBT59

Could you drop a message to [email protected]? I will send you a link to upload. Thanks!

arun-sub avatar May 31 '21 02:05 arun-sub

The issue was resolved. Thanks, Arun! We see a 1.75x speed improvement over the main branch of bwa-mem2. Output has very high fidelity to bwa mem 0.7.17.

KBT59 avatar Jun 08 '21 12:06 KBT59

Thanks Brad. Let me know if you would like me to take a closer look at any discrepancies.

arun-sub avatar Jun 08 '21 13:06 arun-sub