nextclade
ENH: Add coverage to Nextclade output?
Coverage is normally defined as the number of validly called bases divided by the length of the viral genome.
Does Nextclade have a column for this?
The only way I can see to compute it is to combine the reference length with:
- totalMissing
- totalNonACGTNs
- alignmentStart
- alignmentEnd
But this seems error-prone and messy.
Is there an easier way?
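For reference, the workaround described above might look like the following minimal sketch. It rests on several assumptions that the thread does not confirm: that `alignmentStart` and `alignmentEnd` describe a half-open range on the reference (add 1 to the span if they turn out to be inclusive), that `totalMissing` and `totalNonACGTNs` count positions inside that range, and that ambiguous bases should not count as validly called. The function name `coverage_from_columns` and the parameter `ref_length` are hypothetical and not part of Nextclade.

```python
def coverage_from_columns(alignment_start, alignment_end,
                          total_missing, total_non_acgtns, ref_length):
    """Estimate coverage from four Nextclade TSV columns plus the reference length.

    Assumption: [alignment_start, alignment_end) is a half-open range on the
    reference. Deleted positions inside the range are counted as covered here.
    """
    if ref_length <= 0:
        raise ValueError("ref_length must be positive")

    # Number of reference positions spanned by the alignment.
    aligned_span = alignment_end - alignment_start

    # Remove positions that are N (missing) or otherwise ambiguous.
    valid_bases = aligned_span - total_missing - total_non_acgtns

    # Guard against inconsistent inputs producing a negative count.
    valid_bases = max(valid_bases, 0)

    return valid_bases / ref_length


# Illustrative values only, not taken from a real run.
print(coverage_from_columns(50, 29850, 200, 10, 29903))
```

Having to decide on the interval convention and on how deletions are treated is part of what makes this error-prone.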
Good point, we don't have a column for this (yet).
Given that coverage is a metric that may be of interest and that computing it involves five numbers, we may want to add it to the TSV output.
I'll turn this into an enhancement proposal.
So right now we count:
- Ns
- ambiguous nucleotides
- inserted nucleotides
- deleted nucleotides
It would make sense to also count:
- sequenced nucleotides minus inserted nucleotides plus deleted nucleotides

That would give us the total number of aligned bases. Coverage could then be calculated as that number divided by the reference length.
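As a rough illustration of the proposed count, here is a minimal sketch. The function names are hypothetical, and "sequenced nucleotides" is taken to mean the length of the query sequence, which is an assumption. The thread does not say whether Ns and ambiguous nucleotides should be subtracted as well, so this version leaves them in.

```python
def aligned_bases(sequenced, inserted, deleted):
    """Total aligned bases: sequenced nucleotides minus insertions plus deletions."""
    return sequenced - inserted + deleted


def coverage(sequenced, inserted, deleted, ref_length):
    """Aligned bases divided by the reference length."""
    if ref_length <= 0:
        raise ValueError("ref_length must be positive")
    return aligned_bases(sequenced, inserted, deleted) / ref_length


# Illustrative values only, not taken from a real run.
print(aligned_bases(29800, 3, 9))
print(coverage(29800, 3, 9, 29903))
```

If coverage is meant to reflect only validly called bases, the N and ambiguous-nucleotide counts would also need to be subtracted from the numerator.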