funannotate
Diamond blastp database error
Are you using the latest release? funannotate 1.8.9
Describe the bug
I am running funannotate predict and I am getting an error about the diamond blastp database being incomplete. I previously got the same error message for the diamond blastx database, so I updated my databases with funannotate setup -u -d /home/lily/funannotate_db and tested the setup with funannotate test -t predict which didn't return any errors. When I re-ran funannotate predict I no longer got the blastx error but got a new blastp error:
CMD ERROR: diamond blastp --query augustus.training.proteins.fa --db aug_training.dmnd --more-sensitive -o aug.blast.txt -f 6 qseqid sseqid pident --query-cover 80 --subject-cover 80 --id 80 --no-self-hits
b'diamond v2.0.11.149 (C) Max Planck Society for the Advancement of Science\nDocumentation, support and updates available at http://www.diamondsearch.org\nPlease cite: http://dx.doi.org/10.1038/s41592-021-01101-x Nature Methods (2021)\n\n#CPU threads: 48\nScoring parameters: (Matrix=BLOSUM62 Lambda=0.267 K=0.041 Penalties=11/1)\nTemporary directory: \n#Target sequences to report alignments for: 25\nOpening the database... [0s]\nError: Incomplete database file. Database building did not complete successfully.\n'
What command did you issue?
nohup funannotate predict -i cleaned_output/repeatmasked/Fxyl563.pilon.reordered.fasta_clean_sort -o fun_out.563 -d /home/lily/funannotate_db --isolate 389563 --species "Fusarium xylarioides" --pasa_gff Fusve2_ExternalModels_2021-10-20.gff3 --rna_bam bbmap/563_1C_Fxyl563.pilon_sorted.bam --busco_seed_species fusarium_graminearum --protein_evidence Fver.faa $FUNANNOTATE_DB/uniprot_sprot.fasta --transcript_evidence trinity_output.Trinity.fasta > fun_out/fun_out.563/run.out.563 2>&1 &
[funannotate_check.versions.txt](https://github.com/nextgenusfs/funannotate/files/9130707/funannotate_check.versions.txt)
Logfiles Attached
OS/Install Information Attached: Funannotate_blastx_error.txt, funannotate-predict.log, [funannotate-p2g.log](https://github.com/nextgenusfs/funannotate/files/9130668/funannotate-p2g.log), fun_show.versions.txt, run.out.563.txt
Seems like it failed extracting gene models from the GFF file you passed as PASA results. Was that PASA/transdecoder GFF3 file directly or something else?
Ah right, so not a diamond problem. That is a PASA gff3 file downloaded from mycocosm for a closely-related species, so I could just remove it and re-run. Will report back, thanks very much.
You can pass external gene predictions to --other_gff, but they need to be proper gene predictions on your actual genome assembly.
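One quick way to check whether a GFF3 file could belong to a given assembly is to compare the sequence IDs in its first column against the FASTA headers of the genome. The sketch below is not part of funannotate; the function names are made up for illustration, and it assumes plain-text (uncompressed) files. Matching IDs are necessary but not sufficient: gene models from a different species or assembly will still have wrong coordinates even if the contig names happen to match.

```python
def fasta_ids(fasta_path):
    """Return the set of sequence IDs (first word after '>') in a FASTA file."""
    ids = set()
    with open(fasta_path) as handle:
        for line in handle:
            if line.startswith(">"):
                fields = line[1:].split()
                if fields:
                    ids.add(fields[0])
    return ids


def gff_seqids(gff_path):
    """Return the set of column-1 seqids in a GFF3 file, skipping comments."""
    ids = set()
    with open(gff_path) as handle:
        for line in handle:
            if line.startswith("##FASTA"):
                break  # an embedded FASTA section follows; no more features
            if line.startswith("#") or not line.strip():
                continue
            ids.add(line.split("\t", 1)[0])
    return ids


def missing_seqids(fasta_path, gff_path):
    """Return the sorted GFF3 seqids that have no matching FASTA record."""
    return sorted(gff_seqids(gff_path) - fasta_ids(fasta_path))
```

For the files in this thread, a call such as `missing_seqids("genome.softmasked.fa", "Fusve2_ExternalModels_2021-10-20.gff3")` would return the contig names used in the GFF3 that are absent from the assembly; a non-empty list means the annotations were made on a different genome.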
Removing the --pasa_gff flag worked, but I am now getting a different error with CodingQuarry. Do you know what might be causing this? Both stringtie.gff3 and genome.softmasked.fa are in /home/lily/funannotate/fun_out/predict_misc
Full log file attached
[Jul 19 09:45 AM]: Running CodingQuarry prediction using stringtie alignments
[Jul 19 09:49 AM]: CMD ERROR: CodingQuarry -p 2 -f /home/lily/funannotate/fun_out/predict_misc/genome.softmasked.fa -t /home/lily/funannotate/fun_out/predict_misc/stringtie.gff3
The output from running the CodingQuarry command directly is below, which I guess is the problem? I've googled it and it seems to be an augustus issue. Do you know how I can work around it?
Thanks in advance!
CodingQuarry -p 2 -f /home/lily/funannotate/fun_out/predict_misc/genome.softmasked.fa -t /home/lily/funannotate/fun_out/predict_misc/stringtie.gff3
The environmental variable QUARRY_PATH is set as: /home/jp/miniconda2/envs/funannotate/opt/codingquarry-2.0/QuarryFiles
STAGE 1
Prediction from transcript sequences, run 1 of 2...
Prediction from transcript sequences, run 2 of 2...
STAGE 1 complete
Training..
Intron training complete
printed parameters
Remove single-exon predictions
Select prediction regions
Predict from genome sequences
Segmentation fault (core dumped)
It's hard to debug it specifically unless CQ gives you more details. I assume if you run CQ outside of funannotate you still see the error message. Your genome sequence is definitely complete and matches what the stringtie GFF points to. Without more specifics in the error message I wouldn't know what is tripping up CQ...
Hi, just to update this: I used Singularity to pull the Docker image and everything has worked perfectly.