
BUG: Nucleotide and corresponding amino acid mutation occasionally on different branch

Open corneliusroemer opened this issue 2 years ago • 3 comments

Current Behavior

It sometimes happens that a nucleotide (nt) mutation and the corresponding amino acid (aa) mutation are not on the same branch.

Expected behavior

Nucleotide and the corresponding amino acid mutations (if non-synonymous) should always be on the same branch

How to reproduce

I don't have reproducible input files yet. If anyone finds a build where this happens again, please add input files (if shareable).

Possible solution

No idea. This seems non-trivial, as the root of the problem is that amino acids and nucleotides are reconstructed independently by TreeTime.

If the problem is not solvable, it would be good to explain in this issue what the preconditions are for the problem to show.

Evidence

Note that the ORF3a:78 mutation and the corresponding nt mutation at position 25624 are not on the same branch (see screenshot "image (13)" attached to the original issue).
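To double-check that nt position 25624 really falls in ORF3a codon 78, the position can be mapped to its codon. This is only a minimal sketch: the helper name `codon_for_position` is hypothetical, and the ORF3a start coordinate (25393, 1-based, Wuhan-Hu-1 reference) is an assumption that is not stated in this issue.

```python
def codon_for_position(nt_pos: int, gene_start: int) -> int:
    """Return the 1-based codon number that contains a 1-based genome position."""
    if nt_pos < gene_start:
        raise ValueError("position lies upstream of the gene start")
    return (nt_pos - gene_start) // 3 + 1


# Assumption: ORF3a starts at 25393 in Wuhan-Hu-1 coordinates (1-based).
ORF3A_START = 25393

print(codon_for_position(25624, ORF3A_START))  # 78
```

Under that assumption, 25624 is the first base of codon 78, so the two mutations in the screenshot do describe the same change.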

corneliusroemer avatar Mar 01 '22 16:03 corneliusroemer

I think this will be an ncov-specific bug, right? For most pipelines we infer ancestral nuc mutations via augur ancestral and then translate those node-by-node via augur translate. However, for ncov the second step is switched out for scripts/explicit_translation.py, which uses the translations from nextalign/nextclade and doesn't consider the output of augur ancestral.

jameshadfield avatar Mar 01 '22 20:03 jameshadfield

I think @jameshadfield is right about this being an ncov-specific problem. Amino acid mutations come from ancestral state reconstruction on the translated alignments produced by nextalign, while the nucleotide mutations come from augur ancestral's inference of ancestral sequences from the nucleotide alignment. It's easy to imagine how one could get different ancestral state reconstructions from these different inputs to TreeTime.

I would transfer this issue to the ncov repo, where we could consider how to fix it in that context.

huddlej avatar Mar 01 '22 20:03 huddlej

I see! So the actual augur translate keeps the nt and aa mutations linked by translating the reconstructed nucs, so reconstruction happens only once. ncov currently reconstructs twice, and that's where the link is broken.
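To illustrate the "reconstruct once, then translate" idea, here is a rough sketch and not the actual augur translate code. It takes the reconstructed nucleotide sequences of a parent and a child node, and derives both the nt and the aa mutations for that branch from the same pair, so they cannot end up on different branches. The function names, the toy sequences, and the gene coordinates are all made up for illustration.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(seq: str) -> str:
    """Translate a nucleotide string; codons not in the table (N, gaps) become X."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def branch_mutations(parent: str, child: str, gene_start: int, gene_end: int):
    """Return (nt mutations, aa mutations) for one branch.

    Coordinates are 1-based and inclusive. Both lists come from the same
    pair of reconstructed nucleotide sequences.
    """
    nuc = [f"{p}{i + 1}{c}" for i, (p, c) in enumerate(zip(parent, child)) if p != c]
    parent_aa = translate(parent[gene_start - 1:gene_end])
    child_aa = translate(child[gene_start - 1:gene_end])
    aa = [f"{p}{i + 1}{c}" for i, (p, c) in enumerate(zip(parent_aa, child_aa)) if p != c]
    return nuc, aa


# Toy example: a 9-nt "genome" that is a single gene (hypothetical data).
nuc_muts, aa_muts = branch_mutations("ATGCAGTAA", "ATGCATTAA", 1, 9)
print(nuc_muts, aa_muts)  # ['G6T'] ['Q2H']
```

With two independent reconstructions there is no such guarantee, because each one can place the change on a different branch.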

It's worth noting that I encountered this in ncov-simple builds. But since ncov uses (almost) the same script, the bug should also be present there, just unnoticed so far.

I agree that transfer makes sense then.

corneliusroemer avatar Mar 01 '22 20:03 corneliusroemer