emapper.py has multiple problems
- The instructions state that basic usage is as follows:
emapper.py -i FASTA_FILE_PROTEINS -o test
This is not correct if you download the databases to a directory other than the eggnog-mapper directory.
- The instructions imply that eggnog-mapper will query the EGGNOG_DATA_DIR environment variable. It does not.
(base) [user@mcc-login001 user]$ echo $EGGNOG_DATA_DIR
/scratch/user/tmp
(base) [user@mcc-login001 farman]$ ls $EGGNOG_DATA_DIR/
e5.proteomes.faa e5.taxid_info.tsv eggnog.db eggnog_proteins.dmnd eggnog.taxa.db eggnog.taxa.db.traverse.pkl fungi.dmnd
(base) [user@mcc-login001 user]$ singularity run --app eggnogmapper2112 /share/singularity/images/ccs/conda/amd-conda22-rocky9.sinf emapper.py -i SSproteins.fasta -o MyProteinAnnotations
DIAMOND database /usr/local/Miniconda3/envs/eggnog-mapper-2.1.12/lib/python3.11/site-packages/data/eggnog_proteins.dmnd not present. Use download_eggnog_database.py to fetch it
- If you point explicitly to the DIAMOND database that is to be used, it throws another error:
(base) [user@mcc-login001 farman]$ singularity run --app eggnogmapper2112 /share/singularity/images/ccs/conda/amd-conda22-rocky9.sinf emapper.py -i SSproteins.fasta --dmnd_db tmp/fungi.dmnd -o MyProteinAnnotation
Annotation database data/eggnog.db not present. Use download_eggnog_database.py to fetch it
- There doesn't appear to be any option flag that allows one to specify the path to the annotation database (eggnog.db).
- The process does not die on failure to detect a database. SLURM jobs keep running until they time out.
- Jobs run on the login node run forever, create empty files and empty directories, and never write to them. No error messages are produced.
- Jobs run in SLURM fail to start any processes, keep running forever, and throw no error messages.
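Since the job does not exit when a database is missing, one workaround is to check for the files before launching emapper.py. The following is a minimal sketch, not part of eggnog-mapper: the function names `missing_eggnog_files` and `preflight` are hypothetical, and the two file names are simply the ones mentioned in the error messages above.

```python
import os

# File names taken from the two error messages quoted above;
# this list is an assumption and may need adjusting for other setups.
REQUIRED_FILES = ("eggnog.db", "eggnog_proteins.dmnd")


def missing_eggnog_files(data_dir, required=REQUIRED_FILES):
    """Return the required file names that are not present in data_dir."""
    return [
        name for name in required
        if not os.path.isfile(os.path.join(data_dir, name))
    ]


def preflight(data_dir, required=REQUIRED_FILES):
    """Raise FileNotFoundError early instead of letting a job hang."""
    missing = missing_eggnog_files(data_dir, required)
    if missing:
        raise FileNotFoundError(
            f"Missing in {data_dir}: {', '.join(missing)}"
        )
    return data_dir
```

Calling `preflight(os.environ["EGGNOG_DATA_DIR"])` at the top of a job script would make the job fail immediately with a readable message rather than run until the time limit.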
My custom DIAMOND database is also in a nonstandard location, and I fixed the "DIAMOND database [name].dmnd not present. Use download_eggnog_database.py to fetch it" issue by changing a single line (line 49) in the eggnogmapper/search/search_modes.py file.
Under def get_eggnog_dmnd_db, change this:
    if dmnd_db is not None:
        ret_dmnd_db = dmnd_db
to this:
    if dmnd_db is not None:
        ret_dmnd_db = pjoin(data_path, dmnd_db)
From there, running the following appears to work:
emapper.py -i [protein fasta] --output_dir [output directory] -m diamond --dbname [path to eggnog db dir] --dmnd_db fungi.dmnd
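To illustrate what the one-line change does, here is a small standalone sketch. It assumes `pjoin` is an alias for `os.path.join` (I have not verified how eggnog-mapper imports it), and `resolve_dmnd_db` is a hypothetical stand-in for the patched branch, not the real function.

```python
from os.path import join as pjoin  # assumed to match the alias used in the patch


def resolve_dmnd_db(data_path, dmnd_db):
    """Hypothetical stand-in for the patched branch of get_eggnog_dmnd_db."""
    if dmnd_db is not None:
        # A bare file name is resolved relative to the data directory.
        # os.path.join discards data_path if dmnd_db is an absolute path.
        return pjoin(data_path, dmnd_db)
    return None


# On a POSIX system:
print(resolve_dmnd_db("/scratch/user/tmp", "fungi.dmnd"))
# -> /scratch/user/tmp/fungi.dmnd
print(resolve_dmnd_db("/scratch/user/tmp", "/other/dir/fungi.dmnd"))
# -> /other/dir/fungi.dmnd
```

If that assumption holds, the patch lets --dmnd_db take a bare file name inside the data directory, while an absolute path passed to --dmnd_db would still be used unchanged.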