dorado
Corrupted aux data for read
Issue Report
Please describe the issue:
I am running dorado for base calling and alignment. The base calling runs successfully, but the alignment step always raises errors.
Run environment:
- Dorado version: 0.6.1
- Dorado command: ./dorado basecaller ./[email protected] pod5/ --emit-sam --emit-moves --reference hg38.fa -N 0 --mm2-preset splice --device cuda:0 > ./out.cram
- Operating system:
- Hardware (CPUs, Memory, GPUs): GPU
- Source data type (e.g., pod5 or fast5 - please note we always recommend converting to pod5 for optimal basecalling performance): pod5
- Source data location (on device or networked drive - NFS, etc.): on device
- Details about data (flow cell, kit, read lengths, number of reads, total dataset size in MB/GB/TB): 10GB
- Dataset to reproduce, if applicable (small subset of data to share as a pod5 to reproduce the issue):
Logs
[2024-05-10 10:59:42.005] [info] Running: "basecaller" "[email protected]" "pod5/" "--emit-sam" "--emit-moves" "--reference" "hg38.fa" "-N" "0" "--mm2-preset" "splice" "--device" "cuda:0"
[2024-05-10 10:59:42.008] [info] > Creating basecall pipeline
[2024-05-10 10:59:42.008] [warning] Ignoring '-N 0', using preset default
[2024-05-10 10:59:42.021] [info] - BAM format does not support U, so RNA output files will include T instead of U for all file types.
[2024-05-10 11:00:14.471] [info] cuda:0 using chunk size 10000, batch size 2496
[2024-05-10 11:00:15.217] [info] cuda:0 using chunk size 5000, batch size 5248
[E::bam_aux_next] Corrupted aux data for read ad199373-2a8e-4a01-ba24-2970b6a30907
[E::sam_format1_append] Corrupted aux data for read ad199373-2a8e-4a01-ba24-2970b6a30907
terminate called after throwing an instance of 'std::runtime_error'
what(): Failed to write SAM record, error code -1
Aborted (core dumped)
Hi @yuxinPenny,
Thanks for reporting this. Could you please try the following to help us narrow this down:
- Run with -v or -vv to provide more information
- Remove the --emit-sam and/or --emit-moves flags
- Extract that read from your pod5 files and verify that the issue still occurs
- Add the extracted read pod5 here so we can try to reproduce this, please!
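To make the "extract that read" step concrete, here is a minimal sketch that pulls the failing read IDs out of a saved dorado log so they can be used to subset the pod5 data. The function names and the `failing_ids.txt` file name are hypothetical, and the pattern assumes the htslib error lines look exactly like the ones in the log above.

```python
import re
from pathlib import Path

# Matches htslib error lines of the form seen in the log above, e.g.
# "[E::bam_aux_next] Corrupted aux data for read <36-character read id>".
READ_ID_RE = re.compile(r"Corrupted aux data for read ([0-9a-fA-F-]{36})")


def failing_read_ids(log_text):
    """Return the unique read IDs reported as corrupted, in first-seen order."""
    seen = {}
    for match in READ_ID_RE.finditer(log_text):
        # The same read is reported by more than one htslib function,
        # so keep only its first occurrence.
        seen.setdefault(match.group(1), None)
    return list(seen)


def write_id_list(read_ids, path):
    """Write one read ID per line to a plain-text file (hypothetical helper)."""
    Path(path).write_text("".join(f"{rid}\n" for rid in read_ids))
```

Assuming the `pod5` command-line tools are installed, the resulting list could then be passed to something like `pod5 filter pod5/ --ids failing_ids.txt --output failing_read.pod5`, and dorado re-run on the single-read file to check whether the error still occurs. Please check `pod5 filter --help` for the exact options in your installed version.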
Hi @yuxinPenny,
We've identified an issue with using the splice preset that is causing this problem. A fix is being implemented and will be included in a future release of dorado. Thanks for reporting this!
Hi guys, thanks for picking this up. Just wanted to check whether you have any sort of ETA on this fix, please?
Hi @yuxinPenny and @damioresegun - the fix for this is now available in dorado 0.7