
Unintuitive: Truncated peptides have deletions instead of `X` in alignment

Open corneliusroemer opened this issue 2 years ago • 3 comments

When a gene starts or ends with Ns, a user might expect that section of the gene to be output as `X` rather than `-`, which is what we actually output (see below).

This seems to be the root cause of a bug in covSpectrum: https://github.com/cevo-public/cov-spectrum-website/issues/398

Users can of course convert leading and trailing `-` into `X` themselves, but I feel like this should be done by Nextalign.
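As a rough illustration of that user-side workaround, here is a minimal Python sketch. The function name `mask_terminal_gaps` and its parameters are hypothetical and not part of Nextalign or Nextclade. It assumes a single aligned peptide string with a one-character gap symbol, and it cannot tell a genuine terminal deletion from missing coverage, so it masks both.

```python
def mask_terminal_gaps(aligned: str, gap: str = "-", unknown: str = "X") -> str:
    """Replace leading and trailing gap runs with the 'unknown' symbol.

    Hypothetical helper for post-processing aligned peptides; internal gaps
    are left untouched because they are more likely to be real deletions.
    Assumes `gap` and `unknown` are single characters.
    """
    core = aligned.strip(gap)
    if not core:
        # The whole sequence is gaps (or empty): treat all of it as unknown.
        return unknown * len(aligned)
    lead = len(aligned) - len(aligned.lstrip(gap))   # length of leading gap run
    trail = len(aligned) - len(aligned.rstrip(gap))  # length of trailing gap run
    return unknown * lead + core + unknown * trail


if __name__ == "__main__":
    print(mask_terminal_gaps("--MK-LV---"))
```

Applied to every record of an aligned peptide file, this would give downstream tools `X` at truncated ends while keeping internal `-` as deletions.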

We seem to call the truncated parts deletions. Is that on purpose? It would feel more appropriate to call them `X`. It's much more likely that a partial gene was uploaded than that this is a real deletion.

image

Originally posted by @corneliusroemer in https://github.com/nextstrain/nextclade/issues/731#issuecomment-1039339677

If we decide not to change the output, we may want to make this clearer in the docs.

This issue is related to #730, but with a focus on the file output rather than the web view. I think switching from `-` to `X` would automatically solve #730.

corneliusroemer, Feb 18 '22 11:02

Hi. I was just wondering how you are prioritizing this issue. Do you plan to fix it any time soon?

chaoran-chen, Mar 11 '22 20:03

These are the huge deletions relative to the reference, and this is how it comes out of alignment. Same thing for nucs in #730.

I am not sure why you guys decided it should be N nucs or X amino acids. Have you just invented that randomly, or do you know of examples of tools, or any community agreement, that incomplete fragments should be handled this way? What comes out of mafft and other tools if you feed them your examples?

ivan-aksamentov, Mar 12 '22 20:03

@chaoran-chen I don't think we'll fix this soon within Nextclade. Unfortunately, it's not a simple error that can be fixed in a few lines of code.

corneliusroemer, Mar 13 '22 23:03