funannotate
The format of protein evidence
Hi Jon Palmer
When I run funannotate predict with --protein_evidence A.protein.fasta, I get an error saying that A.protein.fasta is not a valid, existing file. A.protein.fasta was downloaded from NCBI. How should I prepare a file like the closely_related.fasta from the example described at https://funannotate.readthedocs.io/en/lastest/evidence.html?
BTW, the reason I want to use --protein_evidence is that one gene was not predicted, even though it exists in the .gff on NCBI. So I suspect that uniprot_sprot.fasta alone is not enough.
Looking forward to your reply. Best regards!
It seems that either you aren't specifying the path to that file correctly (the message is telling you the file doesn't exist at the path you passed on the command line), or the file is not in FASTA format and cannot be opened by Biopython.
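Both failure modes can be checked before running funannotate. The following is a minimal sketch using only the Python standard library. It is not what funannotate does internally, and the function name check_fasta is made up for illustration. It only confirms that the path points to a file and that the content starts with a ">" header, so a file that passes can still be rejected for other reasons.

```python
import os


def check_fasta(path):
    """Return (ok, message) for a quick sanity check of a FASTA file.

    Hypothetical helper for illustration; not part of funannotate.
    """
    # Failure mode 1: the path passed on the command line is not a file.
    if not os.path.isfile(path):
        return False, f"file not found: {os.path.abspath(path)}"

    n_records = 0
    have_header = False
    with open(path) as handle:
        for line_no, line in enumerate(handle, start=1):
            line = line.strip()
            if not line:
                continue  # blank lines are tolerated
            if line.startswith(">"):
                if len(line) == 1:
                    return False, f"line {line_no}: empty header"
                n_records += 1
                have_header = True
            elif not have_header:
                # Failure mode 2: sequence text before any '>' header.
                return False, f"line {line_no}: sequence data before first '>' header"

    if n_records == 0:
        return False, "no FASTA records found"
    return True, f"{n_records} record(s) found"
```

Calling check_fasta("A.protein.fasta") from the directory where funannotate is run shows the absolute path that was actually looked up, which makes a wrong relative path easy to spot.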
Adding protein evidence is unlikely to change a single gene prediction.
Hi Jon Palmer
I finally solved the problem. It was caused by an illegal header in the .fasta file, which produced the error ValueError: invalid literal for int() with base 10.
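The thread does not say which part of the header was illegal, so the following is only a sketch of one common cleanup, under the assumption that shortening each header to a simple ID is enough. It keeps the first whitespace-delimited token, replaces unusual characters with underscores, and numbers repeated IDs. The function name simplify_fasta_headers is hypothetical.

```python
import re


def simplify_fasta_headers(lines):
    """Yield FASTA lines with each header reduced to a simple, unique ID.

    Assumption: the problematic part of the header is the free-text
    description or special characters; the thread does not confirm this.
    """
    seen = {}
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            # Keep only the first whitespace-delimited token of the header.
            parts = line[1:].split()
            token = parts[0] if parts else "seq"
            # Replace anything other than letters, digits, '.', '_' and '-'.
            token = re.sub(r"[^A-Za-z0-9._-]", "_", token)
            # Make repeated IDs unique by appending a counter.
            count = seen.get(token, 0) + 1
            seen[token] = count
            if count > 1:
                token = f"{token}_{count}"
            yield ">" + token
        else:
            yield line
```

To use it on a file, open the input for reading, pass the file handle to simplify_fasta_headers, and write each yielded line plus a newline to a new file, leaving the original download untouched.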
BTW, when I used only two protein .fasta files from NCBI to predict genes, the target gene I mentioned before was predicted correctly. So is running with a species-specific protein library, instead of only uniprot_sprot.fasta, a better way to predict?
Hi @Niohuruzh. As with most everything, it depends on what you are trying to do. If you are trying to lift over genes from, say, a public annotation of the same species to a new isolate, then don't use funannotate; use something like Liftoff.
For de novo annotation with funannotate, generally you should not use protein models from existing annotations unless there is experimental evidence for those gene models (this is why the default is to use uniprot/swissprot). The reason is that those predictions were most likely made with similar gene prediction algorithms/software -- which are of course a prediction and not actual evidence. So you should not reinforce selection of gene models based on other ab initio predictions -- just because a computer predicted a gene model 10 years ago in your organism of interest, doesn't mean it is actually correct unless it has been experimentally validated.
Hi Jon Palmer
Thanks for your help. Now I fully understand what funannotate is good at, and I'll try Liftoff to annotate the gene.
Have a good day. Best regards!