Corrupted aux data for read

Open yuxinPenny opened this issue 9 months ago • 1 comments

Issue Report

Please describe the issue:

I am running dorado for basecalling and alignment. Basecalling alone runs successfully, but when alignment is enabled the run always fails with the error shown in the logs below.

Run environment:

  • Dorado version: 0.6.1
  • Dorado command: ./dorado basecaller ./[email protected] pod5/ --emit-sam --emit-moves --reference hg38.fa -N 0 --mm2-preset splice --device cuda:0 > ./out.cram
  • Operating system:
  • Hardware (CPUs, Memory, GPUs): GPU
  • Source data type (e.g., pod5 or fast5 - please note we always recommend converting to pod5 for optimal basecalling performance): pod5
  • Source data location (on device or networked drive - NFS, etc.): on device
  • Details about data (flow cell, kit, read lengths, number of reads, total dataset size in MB/GB/TB): 10GB
  • Dataset to reproduce, if applicable (small subset of data to share as a pod5 to reproduce the issue):

Logs

[2024-05-10 10:59:42.005] [info] Running: "basecaller" "[email protected]" "pod5/" "--emit-sam" "--emit-moves" "--reference" "hg38.fa" "-N" "0" "--mm2-preset" "splice" "--device" "cuda:0"
[2024-05-10 10:59:42.008] [info] > Creating basecall pipeline
[2024-05-10 10:59:42.008] [warning] Ignoring '-N 0', using preset default
[2024-05-10 10:59:42.021] [info] - BAM format does not support U, so RNA output files will include T instead of U for all file types.
[2024-05-10 11:00:14.471] [info] cuda:0 using chunk size 10000, batch size 2496
[2024-05-10 11:00:15.217] [info] cuda:0 using chunk size 5000, batch size 5248
[E::bam_aux_next] Corrupted aux data for read ad199373-2a8e-4a01-ba24-2970b6a30907
[E::sam_format1_append] Corrupted aux data for read ad199373-2a8e-4a01-ba24-2970b6a30907
terminate called after throwing an instance of 'std::runtime_error'
  what(): Failed to write SAM record, error code -1
Aborted (core dumped)

yuxinPenny avatar May 10 '24 03:05 yuxinPenny

Hi @yuxinPenny,

Thanks for reporting this. Could you please try the following to help us narrow this down:

  • Run with -v or -vv to provide more information
  • Remove the --emit-sam and/or --emit-moves flags
  • Extract that read from your pod5 files and verify that the issue still occurs
  • Add the extracted read pod5 here so we can try to reproduce this, please!
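
As a starting point for the extraction step, here is a minimal sketch of how the failing read ID(s) could be pulled out of the log above. It assumes the error lines keep the "Corrupted aux data for read <uuid>" shape seen in this report; the function name failing_read_ids is hypothetical and is not part of dorado or pod5.

```python
import re

# Assumption: htslib error lines look like the ones in this report, e.g.
#   [E::bam_aux_next] Corrupted aux data for read <uuid>
CORRUPT_READ = re.compile(
    r"Corrupted aux data for read "
    r"([0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12})"
)


def failing_read_ids(log_text):
    """Return the unique read IDs named in error lines, in first-seen order."""
    seen = []
    for match in CORRUPT_READ.finditer(log_text):
        read_id = match.group(1)
        if read_id not in seen:
            seen.append(read_id)
    return seen


# The two error lines from the log in this issue.
log = (
    "[E::bam_aux_next] Corrupted aux data for read "
    "ad199373-2a8e-4a01-ba24-2970b6a30907\n"
    "[E::sam_format1_append] Corrupted aux data for read "
    "ad199373-2a8e-4a01-ba24-2970b6a30907\n"
)

ids = failing_read_ids(log)
print("\n".join(ids))
```

The printed IDs can be saved one per line to a text file and handed to whichever pod5 subsetting tool you use. The pod5 package's filter subcommand is one option, but please check its --help output for the exact arguments in your installed version.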

malton-ont avatar May 10 '24 10:05 malton-ont

Hi @yuxinPenny,

We've identified an issue with using the splice preset that is causing this problem. A fix is being implemented and will be included in a future release of dorado. Thanks for reporting this!

malton-ont avatar May 14 '24 16:05 malton-ont

Hi guys, thanks for picking this up. Just wanted to check whether you have any sort of ETA on this fix, please?

damioresegun avatar May 17 '24 08:05 damioresegun

Hi @yuxinPenny and @damioresegun - the fix for this is now available in dorado 0.7

tijyojwad avatar May 28 '24 16:05 tijyojwad