
lacks elements in /scratch/mdegenna/slin023/funannotate/dikarya/prfl

Open maysonlin opened this issue 2 years ago • 1 comments

Hi, I used funannotate 1.8.3 to annotate a genome, but I ran into these errors while running funannotate predict:

`WARNING The dataset you provided does not contain the file dataset.cfg, likely because it is an old version. Some parameters will be deduced from the dataset folder name
INFO    The lineage dataset is: dikarya (eukaryota)
INFO    Mode is: genome
INFO    Maximum number of regions limited to: 3
INFO    To reproduce this run: python /funannotate/lib/python2.7/site-packages/funannotate/aux_scripts/funannotate-BUSCO2-py2.py -i /home/slin023/funannotate/funannotateout/predict_misc/genome.softmasked.fa -o phormia_regina -l /scratch/mdegenna/slin023/funannotate/dikarya/ -m genome -c 2 -sp anidulans
INFO    Check dependencies...
INFO    Check input file...
ERROR   The dataset you provided lacks elements in /scratch/mdegenna/slin023/funannotate/dikarya/prfl

ERROR   BUSCO analysis failed !
INFO    Check the logs, read the user guide, if you still need technical support, then please contact mailto:[email protected]`

Here is the command I used to run funannotate predict:

`module load singularity-3.8.2

pwd; hostname; date
 
echo "Running program on $SLURM_CPUS_ON_NODE CPU cores"


singularity exec --env-file /home/slin023/funannotate.txt /home/slurmsample/singularity/funannotate_1.8.3.sif funannotate predict -i /home/slin023/funannotate/masked.fasta --species "Phormia regina" --transcript_evidence  /scratch/mdegenna/slin023/Transannotation/Trinity-GG.fasta --rna_bam /home/data/FLAG/PhormiaMaggotAging/aging/TrimMapping/trimSorted.out.bam -o funannotateout`

I referred to this thread and ran funannotate setup -b dikarya, but still got the same error. Is there any way to manually install the missing dataset, or does the error mean that dataset.cfg is missing? Any suggestions would be appreciated.

maysonlin avatar Aug 05 '22 00:08 maysonlin

This appears to be an incomplete download of the BUSCO dataset. Here is what should be in the dikarya folder:

$ ls -l $FUNANNOTATE_DB/dikarya
total 10624
-rw-r--r--@    1 jon  admin   472K Feb 13  2017 ancestral
-rw-r--r--@    1 jon  admin   4.6M Feb 13  2017 ancestral_variants
-rw-r--r--@    1 jon  admin   132B Feb 13  2017 dataset.cfg
drwxr-xr-x  1314 jon  admin    41K Feb 13  2017 hmms/
drwxr-xr-x     6 jon  admin   192B Feb 13  2017 info/
-rw-r--r--@    1 jon  admin    47K Feb 13  2017 lengths_cutoff
drwxr-xr-x  1314 jon  admin    41K Feb 13  2017 prfl/
-rw-r--r--@    1 jon  admin    31K Feb 13  2017 scores_cutoff
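
To compare a local copy against the listing above, a minimal Python sketch along these lines might help. It only checks that the listed files and folders exist and that the folders are not empty; it does not reproduce BUSCO's own validation, and the function name `check_lineage_dir` is hypothetical. `dataset.cfg` is left out of the required set because the log above shows BUSCO only warning when it is absent.

```python
import os
import sys

# Names taken from the directory listing above. dataset.cfg is not required
# here because its absence only produced a WARNING in the log.
REQUIRED_FILES = ["ancestral", "ancestral_variants", "lengths_cutoff", "scores_cutoff"]
REQUIRED_DIRS = ["hmms", "info", "prfl"]


def check_lineage_dir(path):
    """Return a list of problems found in a BUSCO lineage folder (empty if none)."""
    problems = []
    for name in REQUIRED_FILES:
        if not os.path.isfile(os.path.join(path, name)):
            problems.append("missing file: " + name)
    for name in REQUIRED_DIRS:
        sub = os.path.join(path, name)
        if not os.path.isdir(sub):
            problems.append("missing directory: " + name)
        elif not os.listdir(sub):
            # An interrupted download can leave the folder present but empty.
            problems.append("empty directory: " + name)
    return problems


if __name__ == "__main__":
    # Hypothetical usage: python check_lineage.py /path/to/funannotate_db/dikarya
    if len(sys.argv) > 1:
        for problem in check_lineage_dir(sys.argv[1]):
            print(problem)
```

If this prints nothing, the folder at least has the same top-level layout as the listing; any line it prints points at an entry to download again.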

nextgenusfs avatar Aug 09 '22 21:08 nextgenusfs

Hi, thank you for the reply. I solved the problem by manually downloading the dikarya database from "busco-data.ezlab.org"!

maysonlin avatar Aug 13 '22 04:08 maysonlin