Error in rule DRAM_annotate

Open HamzaMbareche opened this issue 2 years ago • 5 comments

Hi Silas,

I need your help identifying a missing database while running the rule DRAM_annotate. I am using atlas on a compute node that doesn't have internet access, so I had to download the databases manually before running atlas on the cluster, but I think I must have missed something.

Here's the error:

1 fastas found
2022-06-05 19:15:56.543411: Annotation started
Traceback (most recent call last):
  File "/scratch/c/croitoru/mbareche/MAGs_Discovery/databases/conda_envs/dff5f2d90a09bc6b4f1e0a73d11619be/bin/DRAM.py", line 168, in <module>
    args.func(**args_dict)
  File "/scratch/c/croitoru/mbareche/MAGs_Discovery/databases/conda_envs/dff5f2d90a09bc6b4f1e0a73d11619be/lib/python3.10/site-packages/mag_annotator/annotate_bins.py", line 967, in annotate_bins_cmd
    annotate_bins(fasta_locs, output_dir, min_contig_size, prodigal_mode, trans_table, bit_score_threshold,
  File "/scratch/c/croitoru/mbareche/MAGs_Discovery/databases/conda_envs/dff5f2d90a09bc6b4f1e0a73d11619be/lib/python3.10/site-packages/mag_annotator/annotate_bins.py", line 1001, in annotate_bins
    db_handler = DatabaseHandler(db_locs['description_db'])
  File "/scratch/c/croitoru/mbareche/MAGs_Discovery/databases/conda_envs/dff5f2d90a09bc6b4f1e0a73d11619be/lib/python3.10/site-packages/mag_annotator/database_handler.py", line 17, in __init__
    raise ValueError('Database does not exist at path %s' % database_loc)
ValueError: Database does not exist at path None

Can you help me identify which file database_handler.py is looking for and where it's supposed to be?
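
A note on reading the traceback above: the failing call is DatabaseHandler(db_locs['description_db']), and the path it received is None, so the unset entry is the description database in DRAM's configuration. A minimal Python sketch for listing such unset entries follows. It assumes the DRAM configuration is available as a JSON file (for example one written by DRAM-setup.py export_config); the function names and the handling of nested sections are illustrative and are not part of DRAM or atlas.

```python
import json
from pathlib import Path


def find_unset_databases(config, prefix=""):
    """Walk a (possibly nested) config mapping and report problem entries.

    An entry is reported when its value is None, or when it is an absolute
    path that does not exist on disk. Other strings are ignored, since they
    may be descriptions rather than paths (an assumption of this sketch).
    """
    problems = []
    for key, value in config.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            # Recurse into nested sections, keeping a dotted name for reporting.
            problems.extend(find_unset_databases(value, prefix=f"{name}."))
        elif value is None:
            problems.append((name, "not set"))
        elif isinstance(value, str) and Path(value).is_absolute():
            if not Path(value).exists():
                problems.append((name, f"missing on disk: {value}"))
    return problems


def check_config_file(config_path):
    """Load a JSON config file and return its list of problem entries."""
    with open(config_path) as handle:
        return find_unset_databases(json.load(handle))
```

Calling check_config_file("dram_config_file.txt") on an exported config would return a list of (entry, reason) pairs; an entry such as ("description_db", "not set") would correspond to the error shown above.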

Thank you so much.

Best, Hamza

HamzaMbareche avatar Jun 06 '22 16:06 HamzaMbareche

Did you skip the checkM in the meantime?

I suggest you run Atlas without the DRAM annotations and check the quantification results (see the tutorial).

There is a way to import the DRAM databases into Atlas. I'll get back to you.

SilasK avatar Jun 07 '22 20:06 SilasK

I was able to run Atlas on a cluster so the checkM part is not an issue anymore.

I'll try your suggestion and report back. Thanks!

HamzaMbareche avatar Jun 08 '22 02:06 HamzaMbareche

Hi Silas,

I am having the exact same issue.

George

gbouras13 avatar Jun 19 '22 01:06 gbouras13

So the problem is that you cannot download the databases on the cluster node, but the main node has internet access.

In that case you can simply run the following on the main node:

        atlas run None dram_download

Do not pass a profile argument; atlas will then run only this step.

If this doesn't work, or you have already downloaded the DRAM databases, follow these steps to set up the DRAM databases independently of atlas:

  1. Install DRAM or activate the atlas DRAM environment.
  2. Download the DRAM databases (replace {dram_dbdir} with your database directory):
        DRAM-setup.py prepare_databases \
            --output_dir {dram_dbdir} \
            --threads 4 \
            --skip_uniref
  3. Export the DRAM config file:
        DRAM-setup.py export_config --output_file dram_config_file.txt
  4. Add the path to the DRAM config file to the atlas config.yaml:
        dram_config_file : path/to/dram_config_file.txt
  5. Run atlas, but do a dry run first to check that atlas skips the dram_download rule.
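
Before the dry run, it can help to confirm that the dram_config_file entry in config.yaml is really set and points to an existing file. The following Python sketch does this with a simple line-based scan. It is an assumption-laden illustration: it is not a full YAML parser, it only looks at unindented (top-level) keys, and the function names are hypothetical rather than part of atlas.

```python
from pathlib import Path


def read_dram_config_path(atlas_config_text):
    """Return the value of the top-level `dram_config_file` key, or None.

    Line-based scan only; nested keys and multi-line values are not handled.
    """
    for line in atlas_config_text.splitlines():
        if line[:1].isspace():
            continue  # indented lines belong to nested sections
        content = line.split("#", 1)[0].strip()  # drop trailing comments
        key, sep, value = content.partition(":")
        if sep and key.strip() == "dram_config_file":
            # Strip optional quotes; an empty value counts as unset.
            return value.strip().strip("'\"") or None
    return None


def dram_config_is_usable(atlas_config_text):
    """True if the key is set and the file it names exists."""
    path = read_dram_config_path(atlas_config_text)
    return path is not None and Path(path).is_file()
```

For example, dram_config_is_usable(Path("config.yaml").read_text()) returning False would suggest fixing step 4 before starting atlas. The dry run remains the real check that the dram_download rule is skipped.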

SilasK avatar Jun 21 '22 15:06 SilasK

Note to self:

  • [ ] dram_download would be part of atlas download

SilasK avatar Jun 21 '22 15:06 SilasK

Hello,

I ran into the same problem. While I was trying the solution suggested by @SilasK, I was kicked out of the terminal.

I am not sure that my solution is official, but it worked for me.

First, I deleted the DRAM environment that ATLAS had installed, then replaced the dependencies in /atlas/workflow/envs/dram.yaml with the following (taken from https://github.com/WrightonLabCSU/DRAM/blob/master/environment.yaml):

dependencies:
  - python>=3.6
  - pandas
  - pytest
  - scikit-bio
  - prodigal
  - mmseqs2!=10.6d92c
  - hmmer!=3.3.1
  - trnascan-se >=2
  - scipy!=1.9.0
  - sqlalchemy
  - barrnap
  - altair >=4
  - openpyxl
  - networkx
  - ruby
  - parallel
  - dram

Then I ran the command below. Because it needs a huge amount of memory, I had to run it on my cluster. You can probably omit the extra snakemake arguments, but I needed them:

atlas run None dram_download --conda-frontend mamba --profile cluster

When it finished, I activated the newly installed DRAM environment and ran the following command:

DRAM-setup.py update_description_db

Finally, I continued the atlas pipeline.
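
After updating the description database, a quick sanity check is to confirm that the file exists and carries an SQLite header. The Python sketch below assumes that DRAM stores its description database as an SQLite file (an assumption, suggested by the sqlalchemy dependency above) and that you know its path; the function name is illustrative.

```python
from pathlib import Path

# Every SQLite 3 database file starts with this 16-byte header string.
SQLITE_MAGIC = b"SQLite format 3\x00"


def looks_like_sqlite(db_path):
    """Return True if db_path is an existing file with an SQLite 3 header."""
    path = Path(db_path)
    if not path.is_file():
        return False
    with path.open("rb") as handle:
        return handle.read(16) == SQLITE_MAGIC
```

A False result for the path listed under description_db would point to the same situation as the original ValueError: the database is missing or was never built.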

In my case, I successfully passed the DRAM step. I don't know if this is the official solution. @SilasK might clarify that.

Further reading:
https://github.com/WrightonLabCSU/DRAM/issues/191
https://github.com/WrightonLabCSU/DRAM/issues/188

omrctnr avatar Aug 18 '22 06:08 omrctnr

@omrctnr So it worked for you, didn't it?

SilasK avatar Aug 18 '22 07:08 SilasK

@SilasK yes exactly.

omrctnr avatar Aug 18 '22 08:08 omrctnr