Racon GPU floating point exception

Open MRRedlinger opened this issue 4 years ago • 2 comments

I'm encountering a floating point exception after the GPU memory allocation step on most, but not all, of my attempts to polish a reference with racon using the GPU. There are no issues when the -c flag isn't included.

I recently installed the GPU-accelerated version of racon, per the instructions on this GitHub page. I'm attempting to polish a 9 kb reference genome using about 800k ONT reads.

I mapped the reads to the reference with minimap2 and passed the SAM file, along with the reads and the reference, to racon. For this first round of polishing, the GPU-accelerated racon worked perfectly.

For the second round of polishing, I mapped the reads to the racon output from the previous round and passed the SAM file, the racon output and the raw reads to racon. This time I encountered the floating point exception.
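For anyone trying to reproduce this, the two rounds described above can be sketched roughly as follows. This is a minimal sketch and not the exact commands used in this report: the file names, the `map-ont` preset, the thread count and the batch count attached to `-c` are all assumptions, and `polish_round_commands` and `run_round` are hypothetical helper names.

```python
import subprocess


def polish_round_commands(reads, target, sam, cuda_batches=None, threads=4):
    """Build the minimap2 and racon command lines for one polishing round.

    All values here are illustrative assumptions, not the reporter's settings.
    """
    # Map the raw reads against the current target (reference or previous racon output).
    minimap2_cmd = ["minimap2", "-ax", "map-ont", "-t", str(threads), target, reads]

    # racon takes its positional arguments as: <sequences> <overlaps> <target sequences>.
    racon_cmd = ["racon", "-t", str(threads)]
    if cuda_batches is not None:
        # Assumption: the batch count is attached directly to the flag (e.g. -c1)
        # so that it cannot be confused with the first positional argument.
        racon_cmd.append(f"-c{cuda_batches}")
    racon_cmd += [reads, sam, target]
    return minimap2_cmd, racon_cmd


def run_round(reads, target, sam, polished, **kwargs):
    """Run one round: write the SAM file, then write the polished sequence."""
    minimap2_cmd, racon_cmd = polish_round_commands(reads, target, sam, **kwargs)
    with open(sam, "w") as out:
        subprocess.run(minimap2_cmd, stdout=out, check=True)
    with open(polished, "w") as out:
        subprocess.run(racon_cmd, stdout=out, check=True)


if __name__ == "__main__":
    # Round 1 polishes the original reference; round 2 polishes the round 1 output.
    run_round("reads.fastq", "reference.fasta", "round1.sam", "round1.fasta", cuda_batches=1)
    run_round("reads.fastq", "round1.fasta", "round2.sam", "round2.fasta", cuda_batches=1)
```

Leaving `cuda_batches` as `None` gives the CPU-only command, which makes it easy to compare the two modes on the same input.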

This error is repeatable. With the reads above, I can perform the first round of polishing repeatedly without error, but the second round always produces this error. For other barcodes, I get the error on the first round of polishing.

We are using Ubuntu 18.04 with an AMD EPYC CPU. There are two GPUs installed in the system, a Quadro RTX 6000 and an RTX 2080 Ti. racon appears to allocate memory on both when running with the -c flag.
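Since racon appears to use both cards, one possible way to narrow this down is to run the same command with only one GPU visible at a time. The sketch below relies on the standard CUDA environment variable `CUDA_VISIBLE_DEVICES`; the GPU indices, the helper names `env_for_gpu` and `run_on_each_gpu`, and the idea that the mixed GPUs are relevant at all are assumptions, not something confirmed in this thread.

```python
import os
import subprocess


def env_for_gpu(gpu_index, base_env=None):
    """Return a copy of the environment in which only one GPU is visible to CUDA."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_index)
    return env


def run_on_each_gpu(cmd, gpu_indices=(0, 1)):
    """Run the same command once per GPU and collect the exit codes.

    On POSIX systems, subprocess reports a negative return code when the
    process was killed by a signal, so a floating point exception (SIGFPE)
    would be expected to show up as a negative value.
    """
    results = {}
    for idx in gpu_indices:
        proc = subprocess.run(cmd, env=env_for_gpu(idx), stdout=subprocess.DEVNULL)
        results[idx] = proc.returncode
    return results
```

If the crash only occurs with one of the two indices, that would point to a device-specific problem. If it occurs with both, the input data is the more likely trigger.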

MRRedlinger, Apr 05 '21 18:04

Hello, when you do not use the -c flag, are you using any other CUDA options like --cudaaligner-batches? If not, then it is a GPU Racon issue. @tijyojwad, could you please inspect this?

Best regards, Robert

rvaser, Apr 08 '21 03:04

No, when I'm not using the -c option I am not passing any other CUDA options, so racon runs on the CPU only.

Thank you, Matthew

MRRedlinger, Apr 08 '21 03:04