fusioncatcher
fusioncatcher copied to clipboard
Wrong amino acid sequence for fusion protein
Hi Daniel,
We noted an an instance in which FusionCatcher reported a long sequence of Ks in the predicted fusion protein. When we tried to reconstruct the protein sequence we got a different protein.
This is the sequence of the fusion protein that Fusion Catcher predicts:
KKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKCGHCKRLAPEYEAAATRLKGIVPLAKVDCTANTNTCNKYGVSGYPTLKIFRDGEEAGAYDGPRTADGIVSHLKKQAGPASVPLRTEEEFKKFISDKDASIVGFFDDSFSEAHSEFLKAASNLRDNYRFAHTNVESLVNEYDDNGEGIILFRPSHLTNKFEDKTVAYTEQKMTSGKIKKFIQENIFGICPHMTEDNKDLIQGKDLLIAYYDVDYEKNAKGSNYWRNRVMMVAKKFLDAGHKLNFAVASRKTFSHELSDFGLESTAGEIPVVAIRTAKGEKFVMQEEFSRDGKALERFLQDYFDGNLKRYLKASGGDRAGARAAPADRVAERPALGTGTGSLLGLPALGADTV
And this is the line from the final-list_candidate-fusion-genes_sequences.txt
file:
Gene_1_symbol(5end_fusion_partner) Gene_2_symbol(3end_fusion_partner) Fusion_description Counts_of_common_mapping_reads Spanning_pairs Spanning_unique_reads Longest_anchor_found Fusion_finding_method Fusion_point_for_gene_1(5end_fusion_partner) Fusion_point_for_gene_2(3end_fusion_partner) Gene_1_id(5end_fusion_partner) Gene_2_id(3end_fusion_partner) Exon_1_id(5end_fusion_partner) Exon_2_id(3end_fusion_partner) Fusion_sequence Predicted_effect Predicted_fused_transcripts Predicted_fused_proteins
PDIA3 APOE m3 0 4 6 40 BOWTIE+BLAT;BOWTIE+STAR 15:43768557:+ 19:44907778:+ ENSG00000167004 ENSG00000130203 GAGAGGTTCCTGCAGGATTACTTTGATGGCAATCTGAAGAGATACCTGAA*AGCAAGCGGTGGAGACAGAGCCGGAGCCCGAGCTGCGCCAGCAGACCGAG out-of-frame ENST00000300289:1245/ENST00000252486:173;ENST00000300289:1245/ENST00000434152:197;ENST00000300289:1245/ENST00000446996:151;ENST00000300289:1245/ENST00000425718:327 KKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKCGHCKRLAPEYEAAATRLKGIVPLAKVDCTANTNTCNKYGVSGYPTLKIFRDGEEAGAYDGPRTADGIVSHLKKQAGPASVPLRTEEEFKKFISDKDASIVGFFDDSFSEAHSEFLKAASNLRDNYRFAHTNVESLVNEYDDNGEGIILFRPSHLTNKFEDKTVAYTEQKMTSGKIKKFIQENIFGICPHMTEDNKDLIQGKDLLIAYYDVDYEKNAKGSNYWRNRVMMVAKKFLDAGHKLNFAVASRKTFSHELSDFGLESTAGEIPVVAIRTAKGEKFVMQEEFSRDGKALERFLQDYFDGNLKRYLKASGGDRAGARAAPADRVAERPALGTGTGSLLGLPALGADTV;KKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKCGHCKRLAPEYEAAATRLKGIVPLAKVDCTANTNTCNKYGVSGYPTLKIFRDGEEAGAYDGPRTADGIVSHLKKQAGPASVPLRTEEEFKKFISDKDASIVGFFDDSFSEAHSEFLKAASNLRDNYRFAHTNVESLVNEYDDNGEGIILFRPSHLTNKFEDKTVAYTEQKMTSGKIKKFIQENIFGICPHMTEDNKDLIQGKDLLIAYYDVDYEKNAKGSNYWRNRVMMVAKKFLDAGHKLNFAVASRKTFSHELSDFGLESTAGEIPVVAIRTAKGEKFVMQEEFSRDGKALERFLQDYFDGNLKRYLKASGGDRAGARAAPADRVAERPALGTGTGSLLGLPALGADTV;KKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKCGHCKRLAPEYEAAATRLKGIVPLAKVDCTANTNTCNKYGVSGYPTLKIFRDGEEAGAYDGPRTADGIVSHLKKQAGPASVPLRTEEEFKKFISDKDASIVGFFDDSFSEAHSEFLKAASNLRDNYRFAHTNVESLVNEYDDNGEGIILFRPSHLTNKFEDKTVAYTEQKMTSGKIKKFIQENIFGICPHMTEDNKDLIQGKDLLIAYYDVDYEKNAKGSNYWRNRVMMVAKKFLDAGHKLNFAVASRKTFSHELSDFGLESTAGEIPVVAIRTAKGEKFVMQEEFSRDGKALERFLQDYFDGNLKRYLKASGGDRAGARAAPADRVAERPALGTGTGSLLGLPALGADTV;KKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKCGHCKRLAPEYEAAATRLKGIVPLAKVDCTANTNTCNKYGVSGYPTLKIFRDGEEAGAYDGPRTADGIVSHLKKQAGPASVPLRTEEEFKKFISDKDASIVGFFDDSFSEAHSEFLKAASNLRDNYRFAHTNVESLVNEYDDNGEGIILFRPSHLTNKFEDKTVAYTEQKMTSGKIKKFIQENIFGICPHMTEDNKDLIQGKDLLIAYYDVDYEKNAKGSNYWRNRVMMVAKKFLDAGHKLNFAVASRKTFSHELSDFGLESTAGEIPVVAIRTAKGEKFVMQEEFSRDGKALERFLQDYFDGNLKRYLKASGGDRAGARAAPADRVAERPALGTGTGSLLGLPALGADTV
Here is a link to the output dir of fusion catcher:
https://owncloud.incpm.weizmann.ac.il/owncloud/index.php/s/GnMFv4lELikNByb
Thank you,
Gil & Rotem
Thanks for the bug report! I will take a look!
When you reconstructed the protein sequence did you use the gene annotation from Ensembl release 90 (which is used by FusionCatcher v1.00) or something else?
Cheers, Daniel
We took notice to use the same Ensembl release as FusionCatcher.
ok. Thanks!
So, finally I have had some time for looking into this but now the link above is broken. I am not sure if this can be fixed without it.