duration information of base calls

Open daniel-es6 opened this issue 2 months ago • 3 comments

New issue checks

Dorado subcommand

Basecaller

Feature request

Is it possible to find the raw signal duration of each base call? The move table is based on the signal segmentation, but there seems to be no obvious way to find out how dorado segmented the signal.

daniel-es6 avatar Nov 05 '25 01:11 daniel-es6

Hi @daniel-es6,

This information is contained in the ts (trimmed samples) and ns (number of samples) tags. See https://software-docs.nanoporetech.com/dorado/latest/basecaller/sam_spec/#read-tags:

The basecalled sequence corresponds to the interval signal[ts : ns]. The move table maps to the same interval. Note that ns reflects trimming (if any) from the rear of the signal.

malton-ont avatar Nov 05 '25 10:11 malton-ont

Thanks for getting back to me, this was useful. Looking at a read with a move table, I found this: "ns:i:55330 ts:i:1714". Does it mean that the raw signal samples from 1714 to 55330 in the pod5 file were used? A related question: does each move always span the same number of raw signal samples?

daniel-es6 avatar Nov 06 '25 16:11 daniel-es6

@daniel-es6,

Yes, that's exactly what that means.
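As a rough illustration of the signal[ts : ns] interval, here is a minimal Python sketch. It only uses the two tag values quoted in this thread; the helper names (`parse_int_tags`, `basecalled_interval`) are hypothetical, and `raw_signal` is a stand-in list rather than data read from a real pod5 file.

```python
def parse_int_tags(tag_text):
    """Collect integer SAM tags such as 'ns:i:55330' into a dict."""
    tags = {}
    for field in tag_text.split():
        name, type_code, value = field.split(":", 2)
        if type_code == "i":
            tags[name] = int(value)
    return tags


def basecalled_interval(signal, tags):
    """Return signal[ts:ns], the samples the basecalled sequence covers."""
    return signal[tags["ts"]:tags["ns"]]


tags = parse_int_tags("ns:i:55330 ts:i:1714")

# Stand-in for the raw signal of one read (assumption: a real signal
# would be loaded from the pod5 file instead).
raw_signal = list(range(60000))

called = basecalled_interval(raw_signal, tags)
print(len(called))  # number of samples the move table maps to
```

The slice is half-open, so it starts at sample 1714 and stops just before sample 55330.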

The move table describes the signal-to-base mapping. The first element is the stride, which is the number of samples per entry. Each subsequent entry in the move table is either a 0 (no new base) or a 1 (new base). So for a move table that looks like:

mv:B:c,5,1,0,0,1,0,1

The first entry is a 5, so each entry covers 5 samples. There are three 1s, which should match three bases in the sequence. The first 1 occurs in the first event, so the first base corresponds to the first 5 samples. There are then two 0s before the next 1, so the second base corresponds to the next 15 samples, and the last base is 10 samples.

See https://github.com/nanoporetech/dorado/blob/release-v1.2/dorado/utils/sequence_utils.cpp#L251 for how dorado converts a move table into a map of signal points for each base.
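For anyone who wants to inspect a move table by hand, here is a minimal Python sketch that splits the mv tag into its stride and moves and reports the sample offset of each block holding a 1. The function names are hypothetical and this is not dorado's code. It deliberately stops at the positions of the 1s; the linked function remains the reference for how dorado assigns samples to each base.

```python
def parse_move_table(mv_tag):
    """Split an 'mv:B:c,...' tag into (stride, moves)."""
    prefix = "mv:B:c,"
    if not mv_tag.startswith(prefix):
        raise ValueError("expected a tag starting with " + prefix)
    values = [int(v) for v in mv_tag[len(prefix):].split(",")]
    # The first value is the stride; the rest are the 0/1 moves.
    return values[0], values[1:]


def move_offsets(stride, moves):
    """Sample offset (within signal[ts:ns]) of every block holding a 1."""
    return [i * stride for i, move in enumerate(moves) if move == 1]


stride, moves = parse_move_table("mv:B:c,5,1,0,0,1,0,1")
offsets = move_offsets(stride, moves)
n_bases = sum(moves)            # should equal the sequence length
covered = stride * len(moves)   # samples spanned by the whole table

print(stride, offsets, n_bases, covered)
```

The offsets are relative to the start of signal[ts : ns], so adding ts gives the position in the untrimmed signal. Every entry spans the same number of samples (the stride), while the number of entries between consecutive 1s varies from base to base.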

malton-ont avatar Nov 06 '25 16:11 malton-ont