Error in rule DRAM_annotate
Hi Silas,
I need your help identifying a missing database while running rule DRAM_annotate. I am using atlas on a computing node that doesn't have internet access, so I had to download the databases manually before running atlas on the cluster, but I think I must have missed something.
Here's the error:
1 fastas found
2022-06-05 19:15:56.543411: Annotation started
Traceback (most recent call last):
File "/scratch/c/croitoru/mbareche/MAGs_Discovery/databases/conda_envs/dff5f2d90a09bc6b4f1e0a73d11619be/bin/DRAM.py", line 168, in <module>
args.func(**args_dict)
File "/scratch/c/croitoru/mbareche/MAGs_Discovery/databases/conda_envs/dff5f2d90a09bc6b4f1e0a73d11619be/lib/python3.10/site-packages/mag_annotator/annotate_bins.py", line 967, in annotate_bins_cmd
annotate_bins(fasta_locs, output_dir, min_contig_size, prodigal_mode, trans_table, bit_score_threshold,
File "/scratch/c/croitoru/mbareche/MAGs_Discovery/databases/conda_envs/dff5f2d90a09bc6b4f1e0a73d11619be/lib/python3.10/site-packages/mag_annotator/annotate_bins.py", line 1001, in annotate_bins
db_handler = DatabaseHandler(db_locs['description_db'])
File "/scratch/c/croitoru/mbareche/MAGs_Discovery/databases/conda_envs/dff5f2d90a09bc6b4f1e0a73d11619be/lib/python3.10/site-packages/mag_annotator/database_handler.py", line 17, in __init__
raise ValueError('Database does not exist at path %s' % database_loc)
ValueError: Database does not exist at path None
Can you help me identify which database file database_handler.py is looking for and where it's supposed to be?
Thank you so much.
Best, Hamza
Did you skip the checkM step in the meantime?
I suggest you run Atlas without the DRAM annotations and check the quantification results (see the tutorial).
There is a way to import the DRAM databases into Atlas. I'll get back to you on that.
I was able to run Atlas on a cluster, so the checkM part is not an issue anymore.
I'll try your suggestion and report back. Thanks!
Hi Silas,
I am having the exact same issue.
George
So the problem is that you cannot download the database on the cluster node, but the main node has internet access.
Then you can simply run
atlas run None dram_download
with no profile argument; atlas will only run this step.
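For example, on a setup where only the login node has internet access, one could run the download step there first and submit the full workflow with the cluster profile afterwards. A minimal sketch, assuming the project directory is my_project (the --working-dir value and the profile name are placeholders):

# On the login node (internet access): run only the DRAM database download
atlas run None dram_download --working-dir my_project

# Afterwards, submit the rest of the workflow with the cluster profile as usual
atlas run all --working-dir my_project --profile cluster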
If this doesn't work, or you have already downloaded the DRAM databases, you can follow the steps below.
Download the DRAM databases independently of atlas:
- Install DRAM or activate the atlas DRAM environment.
- Download the DRAM databases:
DRAM-setup.py prepare_databases \
    --output_dir {dram_dbdir} \
    --threads 4 \
    --skip_uniref
- Export the DRAM config file:
DRAM-setup.py export_config --output_file dram_config_file.txt
- Add the path to the DRAM config file to the atlas config.yaml:
dram_config_file: path/to/dram_config_file.txt
- Run atlas, but do a dry-run first to check that atlas skips the dram_download rule (a consolidated sketch of these steps follows below).
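Putting the steps above together, a minimal sketch of the whole procedure, assuming you activate the DRAM conda environment that atlas created by its path (the environment path and {dram_dbdir} are placeholders):

# Activate the DRAM environment atlas created (placeholder path;
# the real one sits under your databases/conda_envs directory)
conda activate /path/to/databases/conda_envs/<dram_env_hash>

# Download the DRAM databases, skipping UniRef to save space and memory
DRAM-setup.py prepare_databases \
    --output_dir {dram_dbdir} \
    --threads 4 \
    --skip_uniref

# Export the resulting DRAM configuration
DRAM-setup.py export_config --output_file dram_config_file.txt

# In the atlas config.yaml add:
#   dram_config_file: path/to/dram_config_file.txt

# Dry-run (a snakemake argument passed through by atlas) to confirm
# that the dram_download rule is skipped
atlas run all --dryrun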
Note to me:
- [ ] dram_download would be part of atlas download
Hello,
I encountered the same problem. While I was trying the solution Silas posted above, I was disconnected from the terminal.
I am not sure that my solution is official, but it worked for me.
First, I deleted the DRAM environment that ATLAS had downloaded, then replaced the dependencies in /atlas/workflow/envs/dram.yaml with the following (I got the information from https://github.com/WrightonLabCSU/DRAM/blob/master/environment.yaml):
dependencies:
- python>=3.6
- pandas
- pytest
- scikit-bio
- prodigal
- mmseqs2!=10.6d92c
- hmmer!=3.3.1
- trnascan-se >=2
- scipy!=1.9.0
- sqlalchemy
- barrnap
- altair >=4
- openpyxl
- networkx
- ruby
- parallel
- dram
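For completeness, a conda env file also needs a channels section; a minimal sketch of what the full dram.yaml might look like (the channel list is an assumption based on typical bioconda environment files, not copied from the atlas repository):

# Hypothetical sketch of atlas/workflow/envs/dram.yaml; channels are assumed
channels:
  - conda-forge
  - bioconda
  - defaults
dependencies:
  - python>=3.6
  - dram
  # ... plus the remaining dependencies listed above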
Then I ran this command. Because it needs a huge amount of memory, I had to run it on my cluster. In my opinion, you can ignore the snakemake arguments, but I had to use these:
atlas run None dram_download --conda-frontend mamba --profile cluster
When it finished, I activated the installed DRAM environment and ran the following command:
DRAM-setup.py update_description_db
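Optionally, to double-check that the database paths were registered before continuing, DRAM can print its current configuration (assuming the DRAM version in this environment provides the print_config subcommand):

# Show which database paths DRAM currently knows about
DRAM-setup.py print_config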
Finally, I continued the atlas pipeline.
In my case, I successfully passed the DRAM step. I don't know if this is the official solution. @SilasK might clarify that.
Further reading: https://github.com/WrightonLabCSU/DRAM/issues/191 https://github.com/WrightonLabCSU/DRAM/issues/188
@omrctnr It worked for you, didn't it?
@SilasK yes exactly.