BRAKER
ProtHint - Error: Inflate error
Dear authors,
I got an error at the ProtHint step.
# Thu Mar 21 12:48:30 2024: Calling prothint.py...
# Thu Mar 21 12:48:30 2024: starting prothint.py
/data/scratch/mpx586/github/gene_predict/ProtHint/bin//prothint.py --threads=2 --geneMarkGtf /data/scratch/mpx586/Batesia_hypochlora/RNA/braker3/braker1/GeneMark-ES/genemark.gtf /data/scratch/mpx586/Batesia_hypochlora/RNA/braker3/braker1/genome.fa /data/scratch/mpx586/Batesia_hypochlora/RNA/braker3/braker1/proteins.fa
Here are the warning messages from the beginning of the run:
WARNING: empty line was removed! This warning will be supressed from now on!
#*********
# Wed Mar 20 14:18:30 2024: check_fasta_headers(): Checking fasta headers of file /data/scratch/mpx586/Batesia_hypochlora/RNA/braker3/orthodb/Arthropoda.fa.gz
#*********
# WARNING: Detected whitespace in fasta header of file /data/scratch/mpx586/Batesia_hypochlora/RNA/braker3/orthodb/Arthropoda.fa.gz. This may later on cause problems! The pipeline will create a new file without spaces or "|" characters and a genome_header.map file to look up the old and new headers. This message will be suppressed from now on!
#*********
#*********
# WARNING: Detected | in fasta header of file /data/scratch/mpx586/Batesia_hypochlora/RNA/braker3/orthodb/Arthropoda.fa.gz. This may later on cause problems! The pipeline will create a new file without spaces or "|" characters and a genome_header.map file to look up the old and new headers. This message will be suppressed from now on!
#*********
#*********
WARNING: empty line was removed! This warning will be supressed from now on!
#*********
# Wed Mar 20 14:18:30 2024: Assuming that this is not a DNA fasta file because other characters than A, T, G, C, N, a, t, g, c, n were contained. If this is supposed to be a DNA fasta file, check the content of your file! If this is supposed to be a protein fasta file, please ignore this message!
# Wed Mar 20 14:18:30 2024: Assuming that this is not a protein fasta file because other characters than AaRrNnDdCcEeQqGgHhIiLlKkMmFfPpSsTtWwYyVvBbZzJjOoUuXx were contained. If this is supposed to be DNA fasta file, please ignore this message.
#*********
# WARNING: something seems to be wrong with the newline character! This is likely to cause problems with the braker.pl pipeline! Please adapt your file to UTF8! This warning will be supressed from now on!
Note:
- Inputs: a soft-masked genome and the Arthropoda protein database downloaded from https://bioinf.uni-greifswald.de/bioinf/partitioned_odb11/Arthropoda.fa.gz
- GeneMark worked; the output of gmes_petap.pl is genemark.gtf (11.29 Mb)
- ProtHint version: 2.6
Could you please take a look and let me know how I can solve this problem? Thank you very much.
I found the issue: I was passing the gzipped file "Arthropoda.fa.gz" directly.
The protein file needs to be unzipped before running.
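
For anyone hitting the same "Inflate error", here is a minimal sketch of the fix described above: check whether the protein file is gzip-compressed and, if so, write an uncompressed copy before starting the run. The helper names `is_gzipped` and `ensure_plain_fasta` are made up for illustration and are not part of BRAKER or ProtHint. The sketch assumes the file is a regular single-stream gzip archive.

```python
import gzip
import shutil
from pathlib import Path

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def is_gzipped(path):
    """Return True if the file starts with the gzip magic bytes."""
    with open(path, "rb") as handle:
        return handle.read(2) == GZIP_MAGIC


def ensure_plain_fasta(path):
    """Return a path to an uncompressed version of `path` (hypothetical helper).

    A file that is not gzip-compressed is returned unchanged.
    """
    path = Path(path)
    if not is_gzipped(path):
        return path
    # "Arthropoda.fa.gz" -> "Arthropoda.fa"; otherwise append ".unzipped"
    if path.suffix == ".gz":
        out_path = path.with_suffix("")
    else:
        out_path = path.with_name(path.name + ".unzipped")
    # Stream the decompressed bytes so large databases are not loaded into memory
    with gzip.open(path, "rb") as src, open(out_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return out_path
```

Calling `ensure_plain_fasta("Arthropoda.fa.gz")` would write `Arthropoda.fa` next to the archive and return its path, which can then be given to the pipeline in place of the `.gz` file. Running `gunzip -k Arthropoda.fa.gz` on the command line does the same job.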
Please close this issue. Thank you very much.