
Error in GetOrganelle run - potentially related to bowtie2 execution

Open greenwoodmp opened this issue 2 years ago • 2 comments

Hello,

I've encountered an issue that prevents me from reproducing the example code for GetOrganelle. Essentially, it seems that GetOrganelle is searching for the bowtie2 large index file (.bt2l), which isn't produced by bowtie2 when I run it. Compared with the provided example log, mine never shows the "Seed bowtie2 index existed!" message, and the two logs diverge from that point on.

Code: TOOLS/GetOrganelle/get_organelle_from_reads.py -1 Arabidopsis_simulated.1.fq.gz -2 Arabidopsis_simulated.2.fq.gz -t 1 -o Araplastome --overwrite -F embplant_pt

Log:


GetOrganelle v1.7.6.1

get_organelle_from_reads.py assembles organelle genomes from genome skimming data. Find updates in https://github.com/Kinggerm/GetOrganelle and see README.md for more information.

Python 3.8.11 (default, Jun 28 2021, 10:57:31) [GCC 10.3.0]
PLATFORM: Linux dahu113 4.9.0-16-amd64 #1 SMP Debian 4.9.272-2 (2021-07-19) x86_64
PYTHON LIBS: GetOrganelleLib 1.7.6.1; numpy 1.20.2; sympy 1.7.1; scipy 1.6.1; psutil 5.8.0
DEPENDENCIES: Bowtie2 2.4.3; SPAdes 3.15.2; Blast 2.11.0
GETORG_PATH=/home/.GetOrganelle
SEED DB: embplant_pt 0.0.0; embplant_mt 0.0.1.minima
LABEL DB: embplant_pt 0.0.1; embplant_mt 0.0.1
WORKING DIR: /home/
TOOLS/GetOrganelle/get_organelle_from_reads.py -1 Arabidopsis_simulated.1.fq.gz -2 Arabidopsis_simulated.2.fq.gz -t 1 -o Araplastome --overwrite -F embplant_pt

2022-07-04 14:10:17,692 - INFO: Pre-reading fastq ...
2022-07-04 14:10:17,692 - INFO: Estimating reads to use ... (to use all reads, set '--reduce-reads-for-coverage inf --max-reads inf')
2022-07-04 14:10:17,820 - INFO: Estimating reads to use finished.
2022-07-04 14:10:17,820 - INFO: Unzipping reads file: Arabidopsis_simulated.1.fq.gz (8796915 bytes)
2022-07-04 14:10:18,181 - INFO: Unzipping reads file: Arabidopsis_simulated.2.fq.gz (9067061 bytes)
2022-07-04 14:10:18,561 - INFO: Counting read qualities ...
2022-07-04 14:10:18,807 - INFO: Identified quality encoding format = Illumina 1.8+
2022-07-04 14:10:18,807 - INFO: Phred offset = 33
2022-07-04 14:10:18,810 - INFO: Trimming bases with qualities (0.00%): 33..33 !
2022-07-04 14:10:18,894 - INFO: Mean error rate = 0.0019
2022-07-04 14:10:18,895 - INFO: Counting read lengths ...
2022-07-04 14:10:19,298 - INFO: Mean = 150.0 bp, maximum = 150 bp.
2022-07-04 14:10:19,298 - INFO: Reads used = 91563+91563
2022-07-04 14:10:19,299 - INFO: Pre-reading fastq finished.

2022-07-04 14:10:19,299 - INFO: Making seed reads ...
2022-07-04 14:10:19,448 - INFO: Making seed - bowtie2 index ...
2022-07-04 14:10:37,915 - INFO: Making seed - bowtie2 index finished.
2022-07-04 14:10:37,916 - INFO: Mapping reads to seed bowtie2 index ...
2022-07-04 14:10:37,955 - ERROR: (ERR): Cannot find the large index Araplastome/test2/seed/embplant_pt.index.1.bt2l
Exiting now ...

2022-07-04 14:10:37,956 - ERROR: Traceback (most recent call last):
  File "TOOLS/GetOrganelle/get_organelle_from_reads.py", line 3941, in main
    seed_fq, seed_sam, new_seed_f = making_seed_reads_using_mapping(
  File "TOOLS/GetOrganelle/get_organelle_from_reads.py", line 3040, in making_seed_reads_using_mapping
    map_with_bowtie2(seed_file=seed_file, original_fq_files=original_fq_files,
  File "/home/TOOLS/GetOrganelle/GetOrganelleLib/pipe_control_func.py", line 413, in map_with_bowtie2
    raise Exception("")
Exception

Total cost 23.65 s

##############################
For trouble-shooting, please
Firstly, check https://github.com/Kinggerm/GetOrganelle/wiki/FAQ
Secondly, check if there are open/closed issues related at https://github.com/Kinggerm/GetOrganelle/issues
If your problem was still not solved, please open an issue at https://github.com/Kinggerm/GetOrganelle/issues
please provide the get_org.log.txt and the assembly_graph.fastg.*.fastg file(s) (can be visualized as *.png to protect your data privacy) if possible!

For reference, I am running GetOrganelle on a HPC which utilises NixOS for package management. I have the necessary dependencies for GetOrganelle installed, including a Python environment containing scipy, numpy, and sympy. I have set up the seed libraries using GetOrganelle/Utilities/get_organelle_config with the "--use-local" option run on the latest GetOrganelleDB data.


Is there any clear indication here of something I may have missed during the installation or running of GetOrganelle?

Cheers, Matt

greenwoodmp, Jul 04 '22 12:07

It seems that bowtie2-build did not run properly, both during the get_organelle_config initialization and in the get_organelle_from_reads run itself. Please run bowtie2-build directly in a terminal and check whether it behaves normally.
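
One way to make that check concrete is to run bowtie2-build on a small FASTA file and confirm that a complete set of index files appears. The following is only a minimal sketch, not part of GetOrganelle: the helper names `missing_index_files` and `check_bowtie2_build` are hypothetical, and it assumes bowtie2-build is invoked as `bowtie2-build <reference_in> <bt2_base>` and writes six files per index (`.1`, `.2`, `.3`, `.4`, `.rev.1`, `.rev.2`, ending in `.bt2`, or `.bt2l` for a large index).

```python
import shutil
import subprocess
from pathlib import Path

# Parts of a bowtie2 index; a small index ends in ".bt2", a large one in ".bt2l".
INDEX_PARTS = ("1", "2", "3", "4", "rev.1", "rev.2")


def missing_index_files(prefix, large=False):
    """Return the expected index files for `prefix` that do not exist on disk."""
    ext = "bt2l" if large else "bt2"
    expected = [Path(f"{prefix}.{part}.{ext}") for part in INDEX_PARTS]
    return [p for p in expected if not p.is_file()]


def check_bowtie2_build(fasta, prefix):
    """Run bowtie2-build (hypothetical helper) and report whether an index appeared.

    Returns a (success, message) tuple instead of raising, so the caller can
    print the reason for a failure.
    """
    exe = shutil.which("bowtie2-build")
    if exe is None:
        return False, "bowtie2-build not found on PATH"
    result = subprocess.run([exe, str(fasta), str(prefix)],
                            capture_output=True, text=True)
    if result.returncode != 0:
        return False, f"exit code {result.returncode}: {result.stderr.strip()}"
    # Accept either a complete small index or a complete large index.
    if not missing_index_files(prefix) or not missing_index_files(prefix, large=True):
        return True, f"index written by {exe}"
    return False, "bowtie2-build exited with 0 but wrote no complete index"
```

Calling `check_bowtie2_build("seed.fasta", "seed.index")` (both file names are placeholders) returns a success flag and a short message. A run that exits cleanly yet leaves `missing_index_files("seed.index")` non-empty would be consistent with the symptom in the log above, where the index step reports that it finished but the mapping step cannot find any index.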

Kinggerm, Jul 04 '22 20:07

Hi, I believe you are correct. I've tried to replicate the example with a version of bowtie2 that isn't derived from the Nix repository and it seems to work fine.
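
When several bowtie2 installations coexist, it can help to see which one a shell would pick up first. The sketch below is an illustration under assumptions, not something from this thread: `find_executables` is a hypothetical helper that lists every executable with a given name on the search path, in lookup order.

```python
import os
from pathlib import Path


def find_executables(name, path=None):
    """List every executable called `name` on the search path, in lookup order.

    `path` defaults to the PATH environment variable; the first entry returned
    is the one a shell would normally run.
    """
    path = os.environ.get("PATH", "") if path is None else path
    hits = []
    for directory in path.split(os.pathsep):
        if not directory:
            continue  # skip empty PATH entries
        candidate = Path(directory) / name
        if candidate.is_file() and os.access(candidate, os.X_OK):
            hits.append(str(candidate))
    return hits
```

For example, `find_executables("bowtie2-build")` returns the candidates in order. Packages installed through Nix normally resolve to a location under `/nix/store`, so the first entry shows whether the Nix build or another installation is the one being used.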

Thanks for the help!

greenwoodmp, Jul 06 '22 08:07