possibly incorrect determination of the frame

Open anoronh4 opened this issue 1 year ago • 5 comments

We have a fusion called SCP2-NTRK1 and we are curious as to why the fusion is being called as out of frame. The nt sequence as reported by arriba is as follows:

ATAATTTAGGCATTGGAGGAGCTGTGGTTGTAACACTCTACAAGATGGGTTTTCCGGAAGCCGCCAG___TTCTTTTAGAACTCATCAAATTGAAGCTGTTCCAACCAGCTCTGCAAGTGATGGATTTAAGGCAAATCTTGTTTTTAAGGAGATTGAGAAGAAACTTGAAGAG___TCTCGCTCTGTCACCCAGGCTGGAGTGCAGTGGCGCGATCTTGGCTCACTGCAACCTCCACCTCCCGGGTTCAAGCGATTCTCGTGCCTCAGCCTCCCGAGTAGCTGGCATTACAGGCACGTGCCACCACACCCAG|GACGGAGAAACAAGTTTGGGATCAACC___GCCCGGCTGTGCTGGCTCCAGAGGATGGGCTGGCCATGTCCCTGCATTTCATGACATTGGGTGGCAGCTCCCTGTCCCCCACCGAGGGCAAAGGCTCTGGGCTCCAAGGCCACATCATCGAGAACCCACAATACTTCAGTGATGCCTGTGAGGGGCTATGCTGGG...CAAGGGCAGGGACGA...___GTGTTCACCACATCAAGCGC

The sequence between the breakpoint (`|`) and the last `___` before it is actually intron sequence (site1 == intron and site2 == CDS). The aa sequence that arriba predicts is as follows:

NLGIGGAVVVTLYKMGFPEAASSFRTHQIEAVPTSSASDGFKANLVFKEIEKKLEEsrsvtqagvqwrdlgslqppppgfkrfsclslpsswhyrhvpphp|grrnkfginrpavlapedglamslhfmtlggsslsptegkgsglqghiienpqyfsdaceglcw

We are a bit confused because the protein sequence after the breakpoint looks to be in-frame, so I'm wondering what causes arriba to say it is out-of-frame.
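For anyone who wants to reproduce the check by hand, here is a minimal sketch of how the reading frame across the breakpoint can be inspected. It is not arriba's own logic: the helper names `strip_markers` and `translate` are hypothetical, the standard genetic code is assumed, and treating `___` and `|` as removable markers while cutting the sequence at the first `...` is a simplification made for this example. The test fragment is the stretch around the breakpoint copied from the nt sequence above.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def strip_markers(seq):
    """Remove the '___' and '|' markers; keep only the part before any '...'.

    Assumption for this sketch: everything after the first '...' is dropped,
    because the reading frame across a gap of unknown length is not defined.
    """
    seq = seq.split("...")[0]
    return seq.replace("___", "").replace("|", "").upper()


def translate(seq, frame=0):
    """Translate a marked-up nucleotide string starting at offset `frame`."""
    nt = strip_markers(seq)
    codons = (nt[i:i + 3] for i in range(frame, len(nt) - 2, 3))
    # Codons containing anything other than A/C/G/T are reported as 'X'.
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


# Fragment spanning the breakpoint, copied from the sequence in this issue.
fragment = (
    "CCACCACACCCAG|GACGGAGAAACAAGTTTGGGATCAACC"
    "___GCCCGGCTGTGCTGGCTCCAGAGGATGGG"
)
print(translate(fragment, frame=0))
```

Translating the fragment in frame 0 gives `PPHPGRRNKFGINRPAVLAPEDG`, which matches the residues around the `|` in the peptide sequence quoted above (`pphp|grrnkfginrpavlapedg`). Likewise, the first 32 nt of the transcript translated with `frame=2` give `NLGIGGAVVV`, the start of the reported peptide.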

By the way, we are using arriba 2.3.0.

anoronh4 Mar 17 '23 19:03