Memory issues with WFA2 and BiWFA while processing millions of paired alignments
Hi WFA2 and BiWFA team,
Thank you for making these fast and efficient libraries available to everyone. I am trying to use the library in my project. It works for small datasets of 2-3 million reads, but with larger datasets of more than 5 million reads it fails with a segmentation fault. The system has 128 GB of RAM. After testing different 'attributes.memory_mode' settings, I found that BiWFA runs out of memory after processing a certain number of alignments and then crashes with a segmentation fault.
My project involves pairwise comparison of n million DNA queries (Illumina reads) against m different reference sequences (amplicons). For each pair I call the following function:
```cpp
std::string nw_function(std::string refseq, std::string query) {
  char *pattern;
  char *text;
  pattern = &refseq[0];
  text = &query[0];
  // Configure alignment attributes
  wavefront_aligner_attr_t attributes = wavefront_aligner_attr_default;
  attributes.distance_metric = gap_affine;
  attributes.alignment_form.span = alignment_end2end; // alignment_end2end;
  attributes.affine_penalties.match = 0;
  attributes.affine_penalties.mismatch = 4;
  attributes.affine_penalties.gap_opening = 20;
  attributes.affine_penalties.gap_extension = 2;
  attributes.memory_mode = wavefront_memory_ultralow;
  // Initialize Wavefront Aligner
  wavefront_aligner_t* const wf_aligner = wavefront_aligner_new(&attributes);
  // Align
  wavefront_bialign(wf_aligner, pattern, refseq.length(), text, refseq.length());
  std::string mycig = get_cigar_string(wf_aligner->cigar, true);
  // Free
  wavefront_aligner_delete(wf_aligner);
  return mycig;
}
```
I also tried your WFA library and ran into the same issue, at a much lower read count. For this reason I moved to your BiWFA library, which significantly increased the number of reads I can process, but not enough to solve the problem. Could you help me identify a solution? In particular, which parameters would I need to change so that BiWFA does not run out of memory? I greatly appreciate your help.