PHANOTATE
Different outputs between 1.5.1 and 1.6.1. Also, odd hash symbols.
Hello,
We are noticing that the ORFs called in 1.5.1 are different from those called in 1.6.1. Also, in 1.6.1 we are seeing +, *, and # characters scattered throughout the sequences. I assume that the * are stop codons? What are the + and # symbols?
For example:
J02459.1_CDS_[complement(25396..26973)] [note=score:-7.092482E+07] YNLKSQ#LSPPI#GRHLFLFRKDARLPTLLIRKQCYKIEQKLLVPLFTL+PLQF#PKHPVFLLHSISNYLFFKSVEGLYKKRNPLS#LMIQLESFNSLPKFVLEEFR#VLNHHTGLANRL#LLLSQELWSNESPHESL#VPVKRAHNLPNGLNSLKVKSKLSFSRGPKSLNYGKLHCQLRYGSCL#VRK+PALPTPFCC#NLPY#PR+WLGLLLLIKKRLLLSE+LSS+GFPIKASEQSQSKSRKLGKGGFLVGLGRFPCVMKI#PEFLRRSLSNFSDPLEANLK#SRK#PTYSHLSFSVETELR+FFDSSLLLKSLYLE+SE#RFILLNSALLGV#VHLLLFKFNKILL+VFIDEAYSRPVR#QNK*+CSNHLQQPLFSNQNKLLGLQVDVGGNERRKNACNSLNELRALPHRY#RVRGHHDVLQGFRTYTFQDASSLRYL#+AGL#LCKPLLNPQNALHKNELHCLRPMVVNNSVRQLSLERILW#DFLILPVYPNLPAWQNFRYYYLSLLPFHVT
HMMER and DIAMOND don't like the # symbols. Do we filter based on score? Can we remove them? Are they stop codons? What are they?
many thanks, Rick
Oh, the + and # are the symbols for their respective stop codons. If PHANOTATE is returning internal stop codons, then there must be an off-by-one bug in the code. Let me try to track it down.
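In the meantime, here is a minimal sketch for flagging affected records in a protein FASTA file. It assumes that *, +, and # are the only stop symbols in the output and that a trailing stop symbol is expected; the names read_fasta and find_internal_stops are hypothetical helpers, not part of PHANOTATE.

```python
# Assumption: these three characters are the only stop-codon markers in the output.
STOP_SYMBOLS = set("*+#")


def read_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif line and header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def find_internal_stops(text):
    """Map each header to the 0-based positions of stop symbols before the last residue."""
    hits = {}
    for header, seq in read_fasta(text):
        # The final character is skipped because a terminal stop is expected.
        positions = [i for i, aa in enumerate(seq[:-1]) if aa in STOP_SYMBOLS]
        if positions:
            hits[header] = positions
    return hits


example = ">good\nMKV*\n>bad\nMK#L+V*\n"
print(find_internal_stops(example))  # {'bad': [2, 4]}
```

Any record reported here has a stop symbol inside the translation, which would point to a frame or coordinate problem rather than something to filter out by score.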
I get the correct translations when I use the newest 1.6.3 version. I think I fixed some bugs in the earlier version that caused incorrect translations.
$ phanotate.py J02459.1.fasta -f faa | head -n 64 | tail -n 2
>J02459.1_CDS_[complement(25396..26973)] [note=score:-7.092482E+07]
MLEFSVIERGGYIPAVEKNKAFLRADGWNDYSFVTMFYLTVFDEHGEKCDIGNVKIGFVGQKEEVSTYSLIDKKFSQLPEMFFSLGESIDYYVNLSKLSDGFKHNLLKAIQDLVVWPNRLADIENESVLNTSLLRGVTLSEIHGQFARVLNGLPELSDFHFSFNRKSAPGFSDLTIPFEVTVNSMPSTNIHAFIGRNGCGKTTILNGMIGAITNPENNEYFFSENNRLIESRIPKGYFRSLVSVSFSAFDPFTPPKEQPDPAKGTQYFYIGLKNAASNSLKSLGDLRLEFISAFIGCMRVDRKRQLWLEAIKKLSSDENFSNMELISLISKYEELRRNEPQIQVDDDKFTKLFYDNIQKYLLRMSSGHAIVLFTITRLVDVVGEKSLVLFDEPEVHLHPPLLSAFLRTLSDLLDARNGVAIIATHSPVVLQEVPKSCMWKVLRSREAINIIRPDIETFGENLGVLTREVFLLEVTNSGYHHLLSQSVDSELSYETILKNYNGQIGLEGRTVLKAMIMNRDEGKVQ*