
DRAM error during annotation

Open poddarharsh15 opened this issue 3 years ago • 11 comments

```
5 fastas found
2022-07-15 15:49:14.912901: Annotation started
0:00:00.045806: Retrieved database locations and descriptions
0:00:00.045894: Annotating MEGAHIT-group-0.5
0:00:21.937310: Turning genes from prodigal to mmseqs2 db
0:00:23.980066: Getting hits from kofam
0:21:38.160079: Getting forward best hits from peptidase
0:22:06.199097: Getting reverse best hits from peptidase
0:22:07.549270: Getting descriptions of hits from peptidase
/home/alifchebbi/anaconda3/envs/DRAM/lib/python3.10/site-packages/mag_annotator/database_handler.py:81: UserWarning: No descriptions were found for your id's. Does this MER0151453 look like an id from peptidase_description
  warnings.warn("No descriptions were found for your id's. Does this %s look like an id from %s" % (list(ids)[0],
Traceback (most recent call last):
  File "/home/alifchebbi/anaconda3/envs/DRAM/bin/DRAM.py", line 189, in <module>
    args.func(**args_dict)
  File "/home/alifchebbi/anaconda3/envs/DRAM/lib/python3.10/site-packages/mag_annotator/annotate_bins.py", line 1040, in annotate_bins_cmd
    annotate_bins(list(set(fasta_locs)), output_dir, min_contig_size, prodigal_mode, trans_table, bit_score_threshold,
  File "/home/alifchebbi/anaconda3/envs/DRAM/lib/python3.10/site-packages/mag_annotator/annotate_bins.py", line 1079, in annotate_bins
    all_annotations = annotate_fastas(fasta_locs, output_dir, db_handler, min_contig_size, prodigal_mode, trans_table,
  File "/home/alifchebbi/anaconda3/envs/DRAM/lib/python3.10/site-packages/mag_annotator/annotate_bins.py", line 1013, in annotate_fastas
    annotate_fasta(fasta_loc, fasta_name, fasta_dir, db_handler, min_contig_size, prodigal_mode, trans_table,
  File "/home/alifchebbi/anaconda3/envs/DRAM/lib/python3.10/site-packages/mag_annotator/annotate_bins.py", line 921, in annotate_fasta
    annotations = annotate_orfs(gene_faa, db_handler, tmp_dir, start_time, custom_db_locs, custom_hmm_locs,
  File "/home/alifchebbi/anaconda3/envs/DRAM/lib/python3.10/site-packages/mag_annotator/annotate_bins.py", line 821, in annotate_orfs
    annotation_list.append(do_blast_style_search(query_db, db_handler.db_locs['peptidase'], tmp_dir,
  File "/home/alifchebbi/anaconda3/envs/DRAM/lib/python3.10/site-packages/mag_annotator/annotate_bins.py", line 684, in do_blast_style_search
    hits = formater(hits, header_dict)
  File "/home/alifchebbi/anaconda3/envs/DRAM/lib/python3.10/site-packages/mag_annotator/annotate_bins.py", line 187, in get_peptidase_description
    header = header_dict[peptidase_hit]
KeyError: 'MER0151453'
```

I tried to update the database after hitting this error, but every time I tried, the command was reported as killed after about 20 minutes. Could you please suggest some ideas? Thank you so much.

poddarharsh15 avatar Jul 18 '22 08:07 poddarharsh15

Try `DRAM-setup.py --update_description_db` and see if that fixes the issue; let me know if not.

rmFlynn avatar Jul 18 '22 15:07 rmFlynn

Thanks for answering me. I ran `DRAM-setup.py --update_description_db`, and after 30 minutes it printed `KILLED`. Does this mean the updating of the database is finished?

poddarharsh15 avatar Jul 18 '22 16:07 poddarharsh15

No, it most likely means there is not enough memory. How much memory do you have, and how much did you give DRAM? This should take 3 hours or more.

rmFlynn avatar Jul 18 '22 17:07 rmFlynn

The workstation we are using has 256 GB of RAM, and at the moment we have almost 1 TB of free HDD space for DRAM. But we are still facing problems updating the database correctly.

poddarharsh15 avatar Jul 18 '22 17:07 poddarharsh15

That should be enough if it is all available, but run `free` and see how much actually is. If it is over 200 GB, then try running the same command with `/usr/bin/time -v` as a prefix and look for `Maximum resident set size (kbytes)` in the output. It may be killed by the OOM killer or some similar early out-of-memory mechanism if it uses up all the memory; something is killing it for sure.
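For reference, the two checks above can be sketched like this (assuming GNU time is installed at `/usr/bin/time`, as on most Linux systems; `sleep 1` is a stand-in for the real DRAM command):

```shell
# 1. Check how much memory is actually free right now
#    (the "available" column is the one that matters).
free -g

# 2. Run the command under GNU time to record its peak memory usage.
#    Note: this must be /usr/bin/time, not the shell's built-in "time",
#    which does not support -v. Replace "sleep 1" with the DRAM command.
/usr/bin/time -v sleep 1 2> time_report.txt

# 3. Peak memory the timed process used, in kilobytes:
grep "Maximum resident set size" time_report.txt
```

If the reported maximum resident set size approaches the machine's total RAM, the kernel's OOM killer is the likely culprit.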

rmFlynn avatar Jul 18 '22 17:07 rmFlynn

Thank you so much for your suggestions. I will try these commands tomorrow and get back to you ASAP. I really appreciate your help.

poddarharsh15 avatar Jul 18 '22 17:07 poddarharsh15

Hello again. I checked the memory using the `free` command, and this is the output:

```
              total        used        free      shared  buff/cache   available
Mem:      263839512     3070700   206545060      133464    54223752   258786668
Swap:       2097148     1169668      927480
```

After this I ran `/usr/bin/time -v free`, as you told me to do, and the output is:

```
              total        used        free      shared  buff/cache   available
Mem:      263839512     3134952   206461260      140804    54243300   258715056
Swap:       2097148     1169644      927504
	Command being timed: "free"
	User time (seconds): 0.00
	System time (seconds): 0.00
	Percent of CPU this job got: 100%
	Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.00
	Average shared text size (kbytes): 0
	Average unshared data size (kbytes): 0
	Average stack size (kbytes): 0
	Average total size (kbytes): 0
	Maximum resident set size (kbytes): 3056
	Average resident set size (kbytes): 0
	Major (requiring I/O) page faults: 0
	Minor (reclaiming a frame) page faults: 138
	Voluntary context switches: 1
	Involuntary context switches: 0
	Swaps: 0
	File system inputs: 0
	File system outputs: 0
	Socket messages sent: 0
	Socket messages received: 0
	Signals delivered: 0
	Page size (bytes): 4096
	Exit status: 0
```

The maximum resident set size is very low. Could you please suggest what I should do so that it works? Thanks in advance.

poddarharsh15 avatar Jul 19 '22 08:07 poddarharsh15

Oh, sorry, I was unclear: run `/usr/bin/time -v DRAM-setup.py --update_description_db`. That should give you an idea of the memory usage of the DRAM command; what you ran above just timed the `free` command itself.

rmFlynn avatar Jul 19 '22 15:07 rmFlynn

You can also try adding the `--skip_uniref` argument to improve results.

rmFlynn avatar Jul 22 '22 14:07 rmFlynn

Good evening again, and sorry for the late reply; I finally managed to execute the commands. I am attaching the outputs with this message; could you please check them? Many thanks.

```
$ /usr/bin/time -v DRAM-setup.py --update_description_db
usage: DRAM-setup.py [-h]
                     {version,prepare_databases,set_database_locations,update_description_db,update_dram_forms,print_config,import_config,export_config}
                     ...
DRAM-setup.py: error: unrecognized arguments: --update_description_db
Command exited with non-zero status 2
	Command being timed: "DRAM-setup.py --update_description_db"
	User time (seconds): 3.13
	System time (seconds): 2.88
	Percent of CPU this job got: 407%
	Elapsed (wall clock) time (h:mm:ss or m:ss): 0:01.47
	Average shared text size (kbytes): 0
	Average unshared data size (kbytes): 0
	Average stack size (kbytes): 0
	Average total size (kbytes): 0
	Maximum resident set size (kbytes): 157952
	Average resident set size (kbytes): 0
	Major (requiring I/O) page faults: 0
	Minor (reclaiming a frame) page faults: 29626
	Voluntary context switches: 140
	Involuntary context switches: 108483
	Swaps: 0
	File system inputs: 0
	File system outputs: 8
	Socket messages sent: 0
	Socket messages received: 0
	Signals delivered: 0
	Page size (bytes): 4096
	Exit status: 2
```

Second run:

```
$ /usr/bin/time -v DRAM-setup.py update_description_db
Traceback (most recent call last):
  File "/home/alifchebbi/anaconda3/envs/DRAM/bin/DRAM-setup.py", line 158, in <module>
    args.func(**args_dict)
  File "/home/alifchebbi/anaconda3/envs/DRAM/lib/python3.10/site-packages/mag_annotator/database_handler.py", line 344, in populate_description_db
    db_handler.populate_description_db(output_loc)
  File "/home/alifchebbi/anaconda3/envs/DRAM/lib/python3.10/site-packages/mag_annotator/database_handler.py", line 235, in populate_description_db
    self.add_descriptions_to_database(self.make_header_dict_from_mmseqs_db(self.db_locs['uniref']),
  File "/home/alifchebbi/anaconda3/envs/DRAM/lib/python3.10/site-packages/mag_annotator/database_handler.py", line 155, in make_header_dict_from_mmseqs_db
    mmseqs_headers_handle = open('%s_h' % mmseqs_db, 'rb')
FileNotFoundError: [Errno 2] No such file or directory: '/media/alifchebbi/nextseq/alifch/alif/DRAM/databases/uniref90.20220617.mmsdb_h'
Command exited with non-zero status 1
	Command being timed: "DRAM-setup.py update_description_db"
	User time (seconds): 3.42
	System time (seconds): 2.91
	Percent of CPU this job got: 205%
	Elapsed (wall clock) time (h:mm:ss or m:ss): 0:03.08
	Average shared text size (kbytes): 0
	Average unshared data size (kbytes): 0
	Average stack size (kbytes): 0
	Average total size (kbytes): 0
	Maximum resident set size (kbytes): 159072
	Average resident set size (kbytes): 0
	Major (requiring I/O) page faults: 4
	Minor (reclaiming a frame) page faults: 29856
	Voluntary context switches: 252
	Involuntary context switches: 26085
	Swaps: 0
	File system inputs: 1640
	File system outputs: 616
	Socket messages sent: 0
	Socket messages received: 0
	Signals delivered: 0
	Page size (bytes): 4096
	Exit status: 1
```

poddarharsh15 avatar Jul 25 '22 14:07 poddarharsh15

The last error is the result of a failed setup; you will want to reinstall DRAM from Conda before your next attempt. There could also be something in your server environment that is preventing DRAM from setting up in multiple ways. This problem will be difficult to solve without your IT department's help, and it could still be the memory, for which there is no easy solution. Your best bet is to use a prebuilt database. There is a secret pre-made database here: https://zenodo.org/record/3827510. It is smaller and should be faster, but it is way out of date, and I can't promise it will work. This is discussed in more depth in issue 30. Good luck!
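A rough sketch of the reinstall-and-prebuilt-database route described above (the conda channels, the environment name, the `--description_db_loc` flag, and the local file path are assumptions; `set_database_locations` is confirmed by the usage message earlier in the thread, but check the DRAM README and the Zenodo record page for the exact file names):

```shell
# Remove the broken environment and reinstall DRAM from bioconda
# (environment name "DRAM" is an assumption; use whatever yours is called).
conda env remove -n DRAM
conda create -n DRAM -c conda-forge -c bioconda dram
conda activate DRAM

# Alternatively, after downloading the prebuilt description database from
# https://zenodo.org/record/3827510, point DRAM at it. Both the flag and
# the path below are hypothetical; verify against DRAM-setup.py --help.
DRAM-setup.py set_database_locations --description_db_loc /path/to/description_db.sqlite
```

Reinstalling into a fresh environment avoids carrying over the half-built database state that caused the `FileNotFoundError` above.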

rmFlynn avatar Aug 05 '22 22:08 rmFlynn