GetOrganelle
animal_mt: No valid Assembly graph found: an example of --reduce-reads-for-coverage
Hi @Kinggerm, I am trying to assemble a mitochondrial genome with GetOrganelle, but I keep getting the error "No valid assembly graph found". I searched the issues section and applied the suggestions there, such as supplying a seed and avoiding parentheses in the directory path, but the result is the same. I am attaching the SPAdes and GetOrganelle log files, and this is the command I used: get_organelle_from_reads.py -1 ../ERR194146.R1.fastq -2 ../ERR194146.R2.fastq -o ERR194146 -t 16 -F animal_mt -s /nfs_master/nirmal/raw/GRCh38.primary_assembly.chrM.fa Here GRCh38.primary_assembly.chrM.fa is the current human mitochondrial reference genome. Please take a look when convenient. get_org.log.txt slim.log.txt slim.log.txt
Please try '--reduce-reads-for-coverage inf --max-reads inf' first; the failure could be caused by a wrong estimate of the target depth.
Hi @Kinggerm, thank you for your quick response. I did try the method you mentioned, but the tool takes approximately 3 to 4 days to complete. Can you suggest good values for these parameters so that the assembly finishes sooner? Thank you in advance.
Optimal values cannot be known before a successful run; otherwise they would already be built into the software. Does the 3-4 day run finish with good results? If so, attach the log file so that I can see whether there is room to fine-tune.
Hi @Kinggerm, sorry for the late response; a health issue kept me from completing the task. I have now done what you asked and am attaching the command and the log file. Please go through it and let me know whether there is room to fine-tune. Command I used: get_organelle_from_reads.py -1 ../../ERR194146.R1.fastq -2 ../../ERR194146.R2.fastq -o test -t 64 -F animal_mt --reduce-reads-for-coverage inf --max-reads inf Log file: get_org.log.txt Time taken: 80 hours 23 minutes 33 seconds. Threads used: 64. RAM: 1 TB. OS: CentOS Linux 7 (Core).
Please ask if you need any more details.
Thank you so much for your help and time.
Thanks for getting back. Hope you are doing well.
As we can see from the --reduce-reads-for-coverage inf --max-reads inf log file, the pre-assembly coverage estimate of 12286 is quite close to the post-assembly result of 12154, which is great, although it seems odd to me that the default parameters do not work.
Anyway, using --reduce-reads-for-coverage 2000 or a similar value should greatly reduce the computational burden.
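For reference, a rerun along these lines might be assembled as follows. This is only a sketch: the input paths and thread count are copied from the command earlier in this thread, the output directory name `test_cov2000` and the helper function are hypothetical, and 2000 is the suggested starting value rather than a tuned one.

```python
import shlex


def build_getorganelle_cmd(fq1, fq2, out_dir, threads, coverage_cap):
    """Build the argv list for a GetOrganelle run with a capped target coverage."""
    return [
        "get_organelle_from_reads.py",
        "-1", fq1,
        "-2", fq2,
        "-o", out_dir,
        "-t", str(threads),
        "-F", "animal_mt",
        "--reduce-reads-for-coverage", str(coverage_cap),
    ]


cmd = build_getorganelle_cmd(
    "../../ERR194146.R1.fastq",
    "../../ERR194146.R2.fastq",
    "test_cov2000",  # hypothetical output directory name
    64,
    2000,
)
# Print a copy-pasteable shell command; to launch it from Python instead,
# pass the list to subprocess.run(cmd, check=True).
print(shlex.join(cmd))
```

Building the command as a list keeps each flag next to its value, which makes it easy to try a few coverage caps in turn.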
Hi @Kinggerm, thank you so much for such a fast response. I will try this approach and get back to you. The only thing I wanted to confirm is whether I should still use the --max-reads inf parameter. Thank you so much.
--max-reads and --reduce-reads-for-coverage both limit the number of reads; whichever gives the smaller number takes effect. Given that coverage estimation works for your taxon and that --reduce-reads-for-coverage will impose a strong limit, there is no need to set a value for --max-reads.
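To make the "whichever is smaller" point concrete, here is a toy model of how the two caps interact. It is not GetOrganelle's actual code: the function name, the proportional-scaling formula, and the total read count are illustrative assumptions. Only the coverage estimate of 12286 and the target of 2000 come from this thread.

```python
import math


def effective_read_limit(total_reads, estimated_coverage, target_coverage, max_reads):
    """Toy model: the number of reads kept is the smaller of the two caps."""
    if math.isinf(target_coverage):
        by_coverage = math.inf
    else:
        # Assumption: coverage scales linearly with the number of reads sampled.
        by_coverage = total_reads * target_coverage / estimated_coverage
    # Never more reads than the data set actually contains.
    return int(min(by_coverage, max_reads, total_reads))


total = 100_000_000  # hypothetical total read count, not taken from the logs

# Coverage cap of 2000 with no explicit --max-reads value: the coverage cap decides.
print(effective_read_limit(total, 12286, 2000, math.inf))
# Both caps set to inf: every read is used.
print(effective_read_limit(total, 12286, math.inf, math.inf))
# A small --max-reads value would win over the coverage cap.
print(effective_read_limit(total, 12286, 2000, 1_000_000))
```

Under these assumptions a target of 2000 keeps roughly a sixth of the reads, which is consistent with the advice that no separate --max-reads value is needed here.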
Okay, thank you so much for your response. I will try this approach and get back to you. Thank you so much once again.
Sorry to borrow this thread. I would like to ask what is going wrong in my case: I always get "No valid assembly graph found". @Kinggerm The command I used: get_organelle_from_reads.py -1 /home/mxx/anaconda3/envs/get/BR1_FDMS210380612-1a_1.clean.fq.gz -2 /home/mxx/anaconda3/envs/get/BR1_FDMS210380612-1a_2.clean.fq.gz -o test5 -R 32 -t 64 -F animal_mt --reduce-reads-for-coverage inf --max-reads inf Log files: get_org.log.txt spades.log Please take a look when convenient.
@9326xiaoxiao Your issue is different. Please check #198
Hello, teacher. Thank you for your quick reply. I have sent the test5 file again. Could you explain the environment issue in detail when you have time? Thank you!
GetOrganelle v1.7.6.1
get_organelle_from_reads.py assembles organelle genomes from genome skimming data. Find updates in https://github.com/Kinggerm/GetOrganelle and see README.md for more information.
Python 3.10.6 | packaged by conda-forge | (main, Aug 22 2022, 20:35:26) [GCC 10.4.0]
PLATFORM: Linux antias-Precision-7920-Tower 5.4.0-124-generic #140-Ubuntu SMP Thu Aug 4 02:23:37 UTC 2022 x86_64 x86_64
PYTHON LIBS: GetOrganelleLib 1.7.6.1; numpy 1.23.3; sympy 1.10.1; scipy 1.9.1
DEPENDENCIES: Bowtie2 2.4.5; SPAdes 3.13.0; Blast 2.5.0
GETORG_PATH=/home/mxx/.GetOrganelle
SEED DB: animal_mt 0.0.0
LABEL DB: animal_mt 0.0.1
WORKING DIR: /home/mxx
/home/mxx/anaconda3/envs/get/bin/get_organelle_from_reads.py -1 /home/mxx/anaconda3/envs/get/BR1_FDMS210380612-1a_1.clean.fq.gz -2 /home/mxx/anaconda3/envs/get/BR1_FDMS210380612-1a_2.clean.fq.gz -o test5 -R 32 -t 64 -F animal_mt --reduce-reads-for-coverage inf --max-reads inf
2022-09-28 23:29:11,754 - INFO: Pre-reading fastq ...
2022-09-28 23:29:11,755 - INFO: Unzipping reads file: /home/mxx/anaconda3/envs/get/BR1_FDMS210380612-1a_1.clean.fq.gz (2179665805 bytes)
2022-09-28 23:30:09,808 - INFO: Unzipping reads file: /home/mxx/anaconda3/envs/get/BR1_FDMS210380612-1a_2.clean.fq.gz (2223488277 bytes)
2022-09-28 23:31:08,054 - INFO: Counting read qualities ...
2022-09-28 23:31:08,199 - INFO: Identified quality encoding format = Sanger
2022-09-28 23:31:08,199 - INFO: Phred offset = 33
2022-09-28 23:31:08,200 - INFO: Trimming bases with qualities (0.00%): 33..33 !
2022-09-28 23:31:08,234 - INFO: Mean error rate = 0.0026
2022-09-28 23:31:08,235 - INFO: Counting read lengths ...
2022-09-28 23:32:03,884 - INFO: Mean = 150.0 bp, maximum = 150 bp.
2022-09-28 23:32:03,884 - INFO: Reads used = 32254588+32254588
2022-09-28 23:32:03,884 - INFO: Pre-reading fastq finished.
2022-09-28 23:32:03,884 - INFO: Making seed reads ...
2022-09-28 23:32:03,885 - INFO: Seed bowtie2 index existed!
2022-09-28 23:32:03,885 - INFO: Mapping reads to seed bowtie2 index ...
2022-09-28 23:48:20,109 - INFO: Mapping finished.
2022-09-28 23:48:20,110 - INFO: Seed reads made: test5/seed/animal_mt.initial.fq (3252919 bytes)
2022-09-28 23:48:20,112 - INFO: Making seed reads finished.
2022-09-28 23:48:20,112 - INFO: Checking seed reads and parameters ...
2022-09-28 23:48:20,112 - INFO: The automatically-estimated parameter(s) do not ensure the best choice(s).
2022-09-28 23:48:20,113 - INFO: If the result graph is not a circular organelle genome,
2022-09-28 23:48:20,113 - INFO: you could adjust the value(s) of '-w'/'-R' for another new run.
2022-09-28 23:48:22,334 - INFO: Pre-assembling mapped reads ...
2022-09-28 23:48:22,835 - INFO: Retrying with more reads ..
2022-09-29 00:02:12,944 - WARNING: Pre-assembling failed. The estimations for animal_mt-hitting base-coverage and word size may be misleading.
2022-09-29 00:02:18,285 - INFO: Estimated animal_mt-hitting base-coverage = 305.68
2022-09-29 00:02:18,554 - INFO: Estimated word size(s): 119
2022-09-29 00:02:18,554 - INFO: Setting '-w 119'
2022-09-29 00:02:18,554 - INFO: Setting '--max-extending-len inf'
2022-09-29 00:02:18,691 - INFO: Checking seed reads and parameters finished.
2022-09-29 00:02:18,691 - INFO: Making read index ...
2022-09-29 00:08:03,175 - INFO: 53810080 candidates in all 64509176 reads
2022-09-29 00:08:03,175 - INFO: Pre-grouping reads ...
2022-09-29 00:08:03,175 - INFO: Setting '--pre-w 119'
2022-09-29 00:08:08,081 - INFO: 200000/7054047 used/duplicated
2022-09-29 00:08:16,987 - INFO: 6035 groups made.
2022-09-29 00:08:22,883 - INFO: Making read index finished.
2022-09-29 00:08:22,883 - INFO: Extending ...
2022-09-29 00:08:22,883 - INFO: Adding initial words ...
2022-09-29 00:08:23,013 - INFO: AW 69574
2022-09-29 00:11:10,210 - INFO: Round 1: 53810080/53810080 AI 55903 AW 313224
2022-09-29 00:14:01,165 - INFO: Round 2: 53810080/53810080 AI 74812 AW 427258
2022-09-29 00:16:53,256 - INFO: Round 3: 53810080/53810080 AI 83991 AW 485830
2022-09-29 00:19:46,768 - INFO: Round 4: 53810080/53810080 AI 94188 AW 526156
2022-09-29 00:22:40,683 - INFO: Round 5: 53810080/53810080 AI 94421 AW 530994
2022-09-29 00:25:34,244 - INFO: Round 6: 53810080/53810080 AI 94462 AW 531634
2022-09-29 00:28:27,846 - INFO: Round 7: 53810080/53810080 AI 94489 AW 532002
2022-09-29 00:31:22,043 - INFO: Round 8: 53810080/53810080 AI 94516 AW 532390
2022-09-29 00:34:15,941 - INFO: Round 9: 53810080/53810080 AI 94542 AW 532744
2022-09-29 00:37:09,484 - INFO: Round 10: 53810080/53810080 AI 94559 AW 532878
2022-09-29 00:40:03,430 - INFO: Round 11: 53810080/53810080 AI 94568 AW 532964
2022-09-29 00:42:57,934 - INFO: Round 12: 53810080/53810080 AI 94583 AW 533166
2022-09-29 00:45:51,501 - INFO: Round 13: 53810080/53810080 AI 94589 AW 533242
2022-09-29 00:48:45,230 - INFO: Round 14: 53810080/53810080 AI 94594 AW 533322
2022-09-29 00:51:38,950 - INFO: Round 15: 53810080/53810080 AI 94602 AW 533454
2022-09-29 00:54:32,591 - INFO: Round 16: 53810080/53810080 AI 94622 AW 533656
2022-09-29 00:57:26,444 - INFO: Round 17: 53810080/53810080 AI 94643 AW 533866
2022-09-29 01:00:20,326 - INFO: Round 18: 53810080/53810080 AI 94653 AW 533926
2022-09-29 01:03:14,246 - INFO: Round 19: 53810080/53810080 AI 94661 AW 534056
2022-09-29 01:06:08,152 - INFO: Round 20: 53810080/53810080 AI 94685 AW 534364
2022-09-29 01:09:01,830 - INFO: Round 21: 53810080/53810080 AI 94713 AW 534544
2022-09-29 01:11:55,556 - INFO: Round 22: 53810080/53810080 AI 94722 AW 534658
2022-09-29 01:14:50,300 - INFO: Round 23: 53810080/53810080 AI 94739 AW 534898
2022-09-29 01:17:44,133 - INFO: Round 24: 53810080/53810080 AI 94746 AW 534986
2022-09-29 01:20:37,931 - INFO: Round 25: 53810080/53810080 AI 94747 AW 535014
2022-09-29 01:23:31,789 - INFO: Round 26: 53810080/53810080 AI 94749 AW 535052
2022-09-29 01:26:25,502 - INFO: Round 27: 53810080/53810080 AI 94752 AW 535102
2022-09-29 01:29:19,592 - INFO: Round 28: 53810080/53810080 AI 94762 AW 535220
2022-09-29 01:32:13,464 - INFO: Round 29: 53810080/53810080 AI 94766 AW 535228
2022-09-29 01:35:07,326 - INFO: Round 30: 53810080/53810080 AI 94773 AW 535346
2022-09-29 01:38:01,282 - INFO: Round 31: 53810080/53810080 AI 94779 AW 535378
2022-09-29 01:40:55,142 - INFO: Round 32: 53810080/53810080 AI 94779 AW 535378
2022-09-29 01:40:55,143 - INFO: No more reads found and terminated ...
2022-09-29 01:41:54,458 - INFO: Extending finished.
2022-09-29 01:41:56,501 - INFO: Separating extended fastq file ...
2022-09-29 01:41:56,936 - INFO: Setting '-k 21,55,85,115'
2022-09-29 01:41:56,936 - INFO: Assembling using SPAdes ...
2022-09-29 01:41:56,941 - INFO: spades.py -t 64 --phred-offset 33 -1 test5/extended_1_paired.fq -2 test5/extended_2_paired.fq --s1 test5/extended_1_unpaired.fq --s2 test5/extended_2_unpaired.fq -k 21,55,85,115 -o test5/extended_spades
2022-09-29 01:41:57,143 - WARNING: Assembling exited halfway.
2022-09-29 01:41:57,222 - ERROR: No valid assembly graph found!
Total cost 7966.74 s
Thank you!
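One thing the timestamps in this log suggest is that SPAdes stopped almost immediately after it was launched, which looks more like a start-up or environment problem than a problem with the extended reads. The sketch below measures that gap from the two log lines quoted above. The helper names are hypothetical, and reading a very short gap as a hint of a start-up failure is my interpretation, not something GetOrganelle reports.

```python
from datetime import datetime

# The two relevant lines, copied from the get_org.log.txt output above.
LOG_LINES = [
    "2022-09-29 01:41:56,936 - INFO: Assembling using SPAdes ...",
    "2022-09-29 01:41:57,143 - WARNING: Assembling exited halfway.",
]


def parse_timestamp(line):
    """Parse the leading 'YYYY-MM-DD HH:MM:SS,mmm' stamp of a log line."""
    return datetime.strptime(line[:23], "%Y-%m-%d %H:%M:%S,%f")


def spades_runtime_seconds(lines):
    """Seconds between the SPAdes launch message and the 'exited halfway' warning."""
    start = end = None
    for line in lines:
        if "Assembling using SPAdes" in line:
            start = parse_timestamp(line)
        elif "Assembling exited halfway" in line:
            end = parse_timestamp(line)
    if start is None or end is None:
        return None  # one of the two messages is missing from the log
    return (end - start).total_seconds()


elapsed = spades_runtime_seconds(LOG_LINES)
print(f"SPAdes ran for about {elapsed:.3f} s before exiting")
```

A gap this short means the assembler did essentially no work, so the spades.log in the output directory is the next place to look.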
Command line: /home/mxx/anaconda3/envs/get/bin/spades.py -t 64 --phred-offset 33 -1 /home/mxx/test5/extended_1_paired.fq -2 /home/mxx/test5/extended_2_paired.fq --s1 /home/mxx/test5/extended_1_unpaired.fq --s2 /home/mxx/test5/extended_2_unpaired.fq -k 21,55,85,115 -o /home/mxx/test5/extended_spades
System information:
SPAdes version: 3.13.0
Python version: 3.10.6
OS: Linux-5.4.0-124-generic-x86_64-with-glibc2.31
Output dir: /home/mxx/test5/extended_spades
Mode: read error correction and assembling
Debug mode is turned OFF
Dataset parameters:
Multi-cell mode (you should set '--sc' flag if input data was obtained with MDA (single-cell) technology or --meta flag if processing metagenomic dataset)
Reads:
Hi, I have the same problem. The command and error output are below. I tried all the methods above, but none of them worked. (luozhisen) starv2-PowerEdge-R7525:MuSW004A $ get_organelle_from_reads.py -1 MuSW004A_1_clean.fq.gz -2 MuSW004A_2_clean.fq.gz -R 10 -F animal_mt -t 4 -o animal_mt_out --reduce-reads-for-coverage inf --max-reads inf
GetOrganelle v1.7.7.0
get_organelle_from_reads.py assembles organelle genomes from genome skimming data. Find updates in https://github.com/Kinggerm/GetOrganelle and see README.md for more information.
Python 3.8.19 (default, Mar 20 2024, 19:58:24) [GCC 11.2.0]
PLATFORM: Linux starv2-PowerEdge-R7525 6.5.0-21-generic #21~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Fri Feb 9 13:32:52 UTC 2 x86_64 x86_64
PYTHON LIBS: GetOrganelleLib 1.7.7.0; numpy 1.24.3; sympy 1.12; scipy 1.10.1
DEPENDENCIES: Bowtie2 2.4.1; SPAdes 3.13.1; Blast 2.14.1
GETORG_PATH=/home/data/t040503/.GetOrganelle
SEED DB: animal_mt 0.0.1
LABEL DB: animal_mt 0.0.1
WORKING DIR: /home/data/t040503/lzs/MuSW004A
/home/data/t040503/miniconda3/envs/luozhisen/bin/get_organelle_from_reads.py -1 MuSW004A_1_clean.fq.gz -2 MuSW004A_2_clean.fq.gz -R 10 -F animal_mt -t 4 -o animal_mt_out --reduce-reads-for-coverage inf --max-reads inf
2024-04-04 21:11:10,945 - INFO: Pre-reading fastq ...
2024-04-04 21:11:10,946 - INFO: Unzipping reads file: MuSW004A_1_clean.fq.gz (6782072184 bytes)
2024-04-04 21:13:50,907 - INFO: Unzipping reads file: MuSW004A_2_clean.fq.gz (6807446106 bytes)
2024-04-04 21:16:30,538 - INFO: Counting read qualities ...
2024-04-04 21:16:30,673 - INFO: Identified quality encoding format = Sanger
2024-04-04 21:16:30,673 - INFO: Phred offset = 33
2024-04-04 21:16:30,674 - INFO: Trimming bases with qualities (0.00%): 33..33 !
2024-04-04 21:16:30,745 - INFO: Mean error rate = 0.0011
2024-04-04 21:16:30,746 - INFO: Counting read lengths ...
2024-04-04 21:18:32,035 - INFO: Mean = 148.8 bp, maximum = 150 bp.
2024-04-04 21:18:32,036 - INFO: Reads used = 61902571+61902571
2024-04-04 21:18:32,036 - INFO: Pre-reading fastq finished.
2024-04-04 21:18:32,036 - INFO: Making seed reads ...
2024-04-04 21:18:32,036 - INFO: Seed bowtie2 index existed!
2024-04-04 21:18:32,036 - INFO: Mapping reads to seed bowtie2 index ...
2024-04-04 21:31:03,557 - INFO: Mapping finished.
2024-04-04 21:31:03,557 - INFO: Seed reads made: animal_mt_out/seed/animal_mt.initial.fq (4590030 bytes)
2024-04-04 21:31:03,560 - INFO: Making seed reads finished.
2024-04-04 21:31:03,560 - INFO: Checking seed reads and parameters ...
2024-04-04 21:31:03,560 - INFO: The automatically-estimated parameter(s) do not ensure the best choice(s).
2024-04-04 21:31:03,560 - INFO: If the result graph is not a circular organelle genome,
2024-04-04 21:31:03,560 - INFO: you could adjust the value(s) of '-w'/'-R' for another new run.
2024-04-04 21:31:05,941 - INFO: Pre-assembling mapped reads ...
2024-04-04 21:31:06,573 - INFO: Retrying with more reads ..
2024-04-04 22:03:43,341 - WARNING: Pre-assembling failed. The estimations for animal_mt-hitting base-coverage and word size may be misleading.
2024-04-04 22:03:51,172 - INFO: Estimated animal_mt-hitting base-coverage = 455.92
2024-04-04 22:03:51,423 - INFO: Estimated word size(s): 119
2024-04-04 22:03:51,424 - INFO: Setting '-w 119'
2024-04-04 22:03:51,424 - INFO: Setting '--max-extending-len inf'
2024-04-04 22:03:51,522 - INFO: Checking seed reads and parameters finished.
2024-04-04 22:03:51,523 - INFO: Making read index ...
2024-04-04 22:16:39,226 - INFO: 120506731 candidates in all 123805142 reads
2024-04-04 22:16:39,226 - INFO: Pre-grouping reads ...
2024-04-04 22:16:39,226 - INFO: Setting '--pre-w 119'
2024-04-04 22:16:46,842 - INFO: 200000/1006462 used/duplicated
2024-04-04 22:17:05,412 - INFO: 3060 groups made.
2024-04-04 22:17:15,652 - INFO: Making read index finished.
2024-04-04 22:17:15,655 - INFO: Extending ...
2024-04-04 22:17:15,655 - INFO: Adding initial words ...
2024-04-04 22:17:15,915 - INFO: AW 117098
2024-04-04 22:24:50,128 - INFO: Round 1: 120506731/120506731 AI 263086 AW 760158
2024-04-04 22:33:02,138 - INFO: Round 2: 120506731/120506731 AI 269873 AW 880108
2024-04-04 22:41:36,774 - INFO: Round 3: 120506731/120506731 AI 370492 AW 1514412
2024-04-04 22:51:28,060 - INFO: Round 4: 120506731/120506731 AI 580176 AW 2483826
2024-04-04 23:00:40,465 - INFO: Round 5: 120506731/120506731 AI 640710 AW 2941538
2024-04-04 23:09:39,029 - INFO: Round 6: 120506731/120506731 AI 720053 AW 3420160
2024-04-04 23:18:55,119 - INFO: Round 7: 120506731/120506731 AI 744047 AW 3633410
2024-04-04 23:28:09,184 - INFO: Round 8: 120506731/120506731 AI 755260 AW 3730806
2024-04-04 23:37:21,176 - INFO: Round 9: 120506731/120506731 AI 760490 AW 3786126
2024-04-04 23:47:05,811 - INFO: Round 10: 120506731/120506731 AI 765893 AW 3840872
2024-04-04 23:47:05,811 - INFO: Hit the round limit 10 and terminated ...
2024-04-04 23:49:26,862 - INFO: Extending finished.
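The extension rounds above can be summarised numerically to see whether recruitment had levelled off when the `-R 10` limit was hit. This is a minimal sketch using the last three `Round` lines from this log. The regular expression assumes the "Round N: x/y AI a AW b" layout shown here, and the function names are hypothetical.

```python
import re

# Last three extension rounds, copied from the log above.
ROUND_LINES = [
    "2024-04-04 23:28:09,184 - INFO: Round 8: 120506731/120506731 AI 755260 AW 3730806",
    "2024-04-04 23:37:21,176 - INFO: Round 9: 120506731/120506731 AI 760490 AW 3786126",
    "2024-04-04 23:47:05,811 - INFO: Round 10: 120506731/120506731 AI 765893 AW 3840872",
]

ROUND_RE = re.compile(r"Round (\d+): \d+/\d+ AI (\d+) AW (\d+)")


def parse_rounds(lines):
    """Return (round, AI, AW) integer tuples for every matching log line."""
    rounds = []
    for line in lines:
        match = ROUND_RE.search(line)
        if match:
            rounds.append(tuple(int(g) for g in match.groups()))
    return rounds


def last_round_growth(rounds):
    """Relative growth of the AI counter over the final round."""
    (_, prev_ai, _), (_, last_ai, _) = rounds[-2], rounds[-1]
    return (last_ai - prev_ai) / prev_ai


rounds = parse_rounds(ROUND_LINES)
growth = last_round_growth(rounds)
print(f"AI grew by {growth:.2%} in round {rounds[-1][0]}")
```

Here the AI counter was still increasing when the round limit stopped the extension, whereas the earlier run in this thread ended on its own at round 32 with "No more reads found and terminated".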
2024-04-04 23:49:41,651 - INFO: Separating extended fastq file ...
2024-04-04 23:49:45,306 - INFO: Setting '-k 21,55,85,115'
2024-04-04 23:49:45,307 - INFO: Assembling using SPAdes ...
2024-04-04 23:49:45,366 - INFO: spades.py -t 4 --phred-offset 33 -1 animal_mt_out/extended_1_paired.fq -2 animal_mt_out/extended_2_paired.fq --s1 animal_mt_out/extended_1_unpaired.fq --s2 animal_mt_out/extended_2_unpaired.fq -k 21,55,85,115 -o animal_mt_out/extended_spades
2024-04-04 23:49:46,358 - WARNING: Assembling exited halfway.
2024-04-04 23:49:46,888 - ERROR: No valid assembly graph found!
2024-04-04 23:49:46,889 - WARNING: This might due to a damaged dependency, to unreasonable seed/parameter choices, or to a bug.
2024-04-04 23:49:46,889 - INFO: Please first search similar issues at https://github.com/Kinggerm/GetOrganelle/issues, then leave your message following the same issue, or open an issue at https://github.com/Kinggerm/GetOrganelle/issues if it is new, Please always attach the get_org.log.txt file.
Total cost 9517.53 s
Thank you!
Can you see why?