Values of the mean_qscore_template field in sequencing_summary.txt files are integers instead of floats

Open jourdren opened this issue 1 year ago • 1 comments

Issue Report

Please describe the issue:

Values of the mean_qscore_template field in sequencing_summary.txt files are integers, whereas floats are expected for means, as is the case with Guppy.

With integers instead of floats, the PHRED score distribution may not look continuous in some QC tools such as ToulligQC (I am one of the developers of ToulligQC), as you can see in the following screenshot.

Screenshot

Steps to reproduce the issue:

Just launch the dorado summary command.

Run environment:

  • Dorado version: 0.6.0
  • Dorado command: dorado summary mybam.bam
  • Operating system: Linux

jourdren avatar Apr 16 '24 15:04 jourdren

I ran into the same problem when plotting the mean qscore distribution. In the Dorado source code, in dorado/read_pipeline/ReadPipeline.cpp, the function ReadCommon::generate_read_tags contains:

    int qs = static_cast<int>(std::round(calculate_mean_qscore()));
    bam_aux_append(aln, "qs", 'i', sizeof(qs), (uint8_t *)&qs);

It seems that the mean qscore is rounded with std::round and converted to an integer before being written to the BAM file.
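To illustrate why this produces a non-continuous distribution, here is a small standalone sketch. It reuses only the rounding expression from the snippet quoted above. The function names to_integer_tag and distinct_after_rounding are hypothetical, and the input values are made up for illustration rather than taken from real data.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <set>
#include <vector>

// Same rounding as in the quoted snippet: round the mean, then store it as int.
int to_integer_tag(float mean_qscore) {
    return static_cast<int>(std::round(mean_qscore));
}

// Hypothetical helper: number of distinct values left once every mean
// has gone through the integer conversion above.
std::size_t distinct_after_rounding(const std::vector<float>& means) {
    std::set<int> seen;
    for (float m : means) {
        seen.insert(to_integer_tag(m));
    }
    return seen.size();
}
```

With illustrative means of 11.6, 12.0 and 12.4, all three collapse to the single value 12, so a histogram built from the rounded values shows isolated spikes at integer positions instead of a smooth curve.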

wzboy1984 avatar May 07 '24 15:05 wzboy1984

This has been resolved in Dorado 0.7.0.

Happy basecalling!

HalfPhoton avatar May 21 '24 17:05 HalfPhoton