OrthoFinder
Running OrthoFinder algorithm. Initial processing of each species ERROR: Blast0_0.txt is corrupted
Dear David,
I installed OrthoFinder v2.5.4 via conda. However, when I run it on the ExampleData, I get the following error:
OrthoFinder version 2.5.4 Copyright (C) 2014 David Emms
2022-08-11 10:01:35 : Starting OrthoFinder 2.5.4
8 thread(s) for highly parallel tasks (BLAST searches etc.)
1 thread(s) for OrthoFinder algorithm
Checking required programs are installed
Test can run "mcl -h" - ok
Test can run "fastme -i /home/cd791/orthofinder_tutorial/OrthoFinder/ExampleData/OrthoFinder/Results_Aug11_11/WorkingDirectory/SimpleTest.phy -o /home/cd791/orthofinder_tutorial/OrthoFinder/ExampleData/OrthoFinder/Results_Aug11_11/WorkingDirectory/SimpleTest.tre" - ok
Dividing up work for BLAST for parallel processing
2022-08-11 10:01:36 : Creating diamond database 1 of 4
2022-08-11 10:01:36 : Creating diamond database 2 of 4
2022-08-11 10:01:36 : Creating diamond database 3 of 4
2022-08-11 10:01:36 : Creating diamond database 4 of 4
Running diamond all-versus-all
Using 8 thread(s)
2022-08-11 10:01:36 : This may take some time....
2022-08-11 10:01:36 : Done 0 of 16
2022-08-11 10:01:51 : Done all-versus-all sequence search
Running OrthoFinder algorithm
2022-08-11 10:01:52 : Initial processing of each species
ERROR: Blast0_0.txt is corrupted
Malformatted line in /home/cd791/orthofinder_tutorial/OrthoFinder/ExampleData/OrthoFinder/Results_Aug11_11/WorkingDirectory/Blast0_0.txt
Offending line was:
ERROR: Error processing files Blast0_*
Process Process-10:
Traceback (most recent call last):
File "/home/cd791/miniconda3/envs/orthofinder/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap
self.run()
File "/home/cd791/miniconda3/envs/orthofinder/lib/python3.10/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "/home/cd791/miniconda3/envs/orthofinder/bin/scripts_of/main.py", line 529, in Worker_ProcessBlastHits
WaterfallMethod.ProcessBlastHits(*args, d_pickle=d_pickle, qDoubleBlast=qDoubleBlast)
File "/home/cd791/miniconda3/envs/orthofinder/bin/scripts_of/main.py", line 516, in ProcessBlastHits
Bij = blast_file_processor.GetBLAST6Scores(seqsInfo, blastDir_list, seqsInfo.speciesToUse[iSpecies], seqsInfo.speciesToUse[jSpecies], qDoubleBlast=qDoubleBlast)
File "/home/cd791/miniconda3/envs/orthofinder/bin/scripts_of/blast_file_processor.py", line 65, in GetBLAST6Scores
for row in blastreader:
_csv.Error: line contains NUL
ERROR: An error occurred, please review the error messages they may contain useful information about the problem.
It also creates a diamond_output.txt.gz output in the ~/orthofinder_tutorial/OrthoFinder directory:
| gi|290752891|emb|CBH40866.1| 23.1 507 316 23 7 462 2 485 1.9e-09 56.2
gi|290752280|emb|CBH40251.1| gi|290752893|emb|CBH40868.1| 24.1 237 138 7 11 224 369 586 9.3e-09 53.9
gi|290752280|emb|CBH40251.1| gi|290752592|emb|CBH40564.1| 25.0 220 154 7 7 223 357 568 6.0e-08 51.2
gi|290752280|emb|CBH40251.1| gi|290752391|emb|CBH40362.1| 22.8 162 108 4 5 153 10 167 1.0e-07 50.4
gi|290752280|emb|CBH40251.1| gi|290752979|emb|CBH40955.1| 21.1 261 136 8 29 223 36 292 2.3e-07 49.3
gi|290752280|emb|CBH40251.1| gi|290752668|emb|CBH40641.1| 30.0 100 63 3 117 209 798 897 2.8e-05 42.4
gi|290752280|emb|CBH40251.1| gi|290752491|emb|CBH40463.1| 42.1 38 22 0 28 65 37 74 1.1e-04 40.4
gi|290752280|emb|CBH40251.1| gi|290752373|emb|CBH40344.1| 37.3 51 32 0 20 70 28 78 1.8e-04 39.7
gi|290752281|emb|CBH40252.1| gi|290752281|emb|CBH40252.1| 100.0 662 0 0 1 662 1 662 0.0e+00 1117.8
gi|290752282|emb|CBH40253.1| gi|290752282|emb|CBH40253.1| 100.0 325 0 0 1 325 1 325 2.8e-181 626.3
gi|290752283|emb|CBH40254.1| gi|290752283|emb|CBH40254.1| 100.0 220 0 0 1 220 1 220 5.3e-128 448.7
gi|290752283|emb|CBH40254.1| gi|290752284|emb|CBH40255.1| 50.7 213 104 1 1 213 1 212 1.1e-59 221.9
gi|290752284|emb|CBH40255.1| gi|290752284|emb|CBH40255.1| 100.0 217 0 0 1 217 1 217 1.2e-124 437.6
gi|290752284|emb|CBH40255.1| gi|290752283|emb|CBH40254.1| 50.7 213 104 1 1 212 1 213 4.0e-59 219.9
gi|290752285|emb|CBH40256.1| gi|290752285|emb|CBH40256.1| 100.0 621 0 0 1 621 1 621 0.0e+00 1138.6
diamond_output.txt
Any advice for these issues, please?
Thanks.
Best regards, Chiara
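A note on the `_csv.Error: line contains NUL` in the traceback above: under the Python 3.10 shown there, the csv reader raises this when the file it is parsing contains a NUL (zero) byte. A minimal sketch for checking whether a search-result file really contains such bytes is below. The function name `find_nul_lines` and the script name in the usage comment are made up for illustration, and the idea that NUL bytes point to a truncated or partially written file is an assumption, not something confirmed in this thread.

```python
def find_nul_lines(path, max_report=5):
    """Return (total NUL bytes, first few 1-based line numbers that contain one)."""
    total = 0
    bad_lines = []
    # Binary mode so NUL bytes are read exactly as stored on disk.
    with open(path, "rb") as handle:
        for line_number, line in enumerate(handle, start=1):
            count = line.count(b"\x00")
            if count:
                total += count
                if len(bad_lines) < max_report:
                    bad_lines.append(line_number)
    return total, bad_lines


if __name__ == "__main__":
    import sys

    # Hypothetical usage: python check_nul.py path/to/WorkingDirectory/Blast0_0.txt
    for name in sys.argv[1:]:
        total, bad_lines = find_nul_lines(name)
        print(f"{name}: {total} NUL byte(s); first affected lines: {bad_lines}")
```

If the count is non-zero, the file on disk is damaged and the problem lies in how the search output was written. If it is zero, the issue is more likely in how the file is being read.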
I'm also having this issue. Have you found a solution?
Edit 1:
After looking more into this issue, it appears to be something with diamond. I was able to partially fix it by doing conda install -c bioconda diamond=0.9.4.
However, I'm having another issue now. I'm unsure if it's related:
Reconciling gene trees and species tree
---------------------------------------
Outgroup: Mycoplasma_hyopneumoniae
2022-10-22 23:57:40 : Starting Recon and orthologues
2022-10-22 23:57:40 : Starting OF Orthologues
Traceback (most recent call last):
File "/Users/user/opt/anaconda3/envs/longenv/bin/Orthofinder", line 7, in <module>
main(args)
File "/Users/user/opt/anaconda3/envs/longenv/bin/scripts_of/__main__.py", line 1778, in main
GetOrthologues(speciesInfoObj, options, prog_caller)
File "/Users/user/opt/anaconda3/envs/longenv/bin/scripts_of/__main__.py", line 1540, in GetOrthologues
orthologues.OrthologuesWorkflow(speciesInfoObj.speciesToUse,
File "/Users/user/opt/anaconda3/envs/longenv/bin/scripts_of/orthologues.py", line 1090, in OrthologuesWorkflow
ReconciliationAndOrthologues(recon_method, db.ogSet, nHighParallel, nLowParallel, i if qMultiple else None, stride_dups=stride_dups, q_split_para_clades=q_split_para_clades)
File "/Users/user/opt/anaconda3/envs/longenv/bin/scripts_of/orthologues.py", line 870, in ReconciliationAndOrthologues
nOrthologues_SpPair = trees2ologs_of.DoOrthologuesForOrthoFinder(ogSet, species_tree_rooted_labelled, trees2ologs_of.GeneToSpecies_dash,
File "/Users/user/opt/anaconda3/envs/longenv/bin/scripts_of/trees2ologs_of.py", line 1123, in DoOrthologuesForOrthoFinder
nOrthologues_SpPair = RunOrthologsParallel(ta, len(ogSet.speciesToUse), args_queue, n_parallel)
File "/Users/user/opt/anaconda3/envs/longenv/bin/scripts_of/trees2ologs_of.py", line 1276, in RunOrthologsParallel
proc.start()
File "/Users/user/opt/anaconda3/envs/longenv/lib/python3.10/multiprocessing/process.py", line 121, in start
self._popen = self._Popen(self)
File "/Users/user/opt/anaconda3/envs/longenv/lib/python3.10/multiprocessing/context.py", line 224, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "/Users/user/opt/anaconda3/envs/longenv/lib/python3.10/multiprocessing/context.py", line 288, in _Popen
return Popen(process_obj)
File "/Users/user/opt/anaconda3/envs/longenv/lib/python3.10/multiprocessing/popen_spawn_posix.py", line 32, in __init__
super().__init__(process_obj)
File "/Users/user/opt/anaconda3/envs/longenv/lib/python3.10/multiprocessing/popen_fork.py", line 19, in __init__
self._launch(process_obj)
File "/Users/user/opt/anaconda3/envs/longenv/lib/python3.10/multiprocessing/popen_spawn_posix.py", line 47, in _launch
reduction.dump(process_obj, fp)
File "/Users/user/opt/anaconda3/envs/longenv/lib/python3.10/multiprocessing/reduction.py", line 60, in dump
ForkingPickler(file, protocol).dump(obj)
TypeError: cannot pickle '_io.TextIOWrapper' object
Edit 2: Seems like it was another issue with version(s). I downgraded Python to python=3.7 to fix the above issue, and I was able to get everything to work. I hope this comment helps!
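Some background on why the Python downgrade probably helped: the traceback goes through `popen_spawn_posix.py`, and on macOS Python 3.8 and later start worker processes with the "spawn" method by default. Spawn pickles the process object and everything it references, while the "fork" default of Python 3.7 does not. An open text file cannot be pickled, so any worker argument that holds one fails in exactly this way. The sketch below only demonstrates that behaviour. `can_pickle` is a hypothetical helper, and which OrthoFinder object actually holds the open file is not known from this thread.

```python
import io
import multiprocessing
import pickle


def can_pickle(obj):
    """Return True if obj can be pickled, False if pickle raises TypeError."""
    try:
        pickle.dumps(obj)
        return True
    except TypeError:
        return False


if __name__ == "__main__":
    # "spawn" on macOS (Python 3.8+) and Windows, "fork" on most Linux setups.
    print("default start method:", multiprocessing.get_start_method())

    # An in-memory text stream of the same type as an open text file.
    handle = io.TextIOWrapper(io.BytesIO(b"example"))
    print("plain dict picklable:", can_pickle({"n_parallel": 8}))
    print("dict holding an open text file picklable:", can_pickle({"log": handle}))
```

Under spawn, everything passed to a worker process has to pass this kind of check, which is why code that works with fork can break after a Python or platform change.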
Hi sdtruong,
I will explain how I fixed my issue:
OrthoFinder was built for Python 2.7, so it is incompatible with the Python 3.10 installed on the hydrogen cluster.
See the links below:
https://github.com/davidemms/OrthoFinder/issues/328
https://github.com/bioconda/bioconda-recipes/pull/20155/files
Check python version, create py27 env, and install python v2.7.6
python --version
Python 3.10.4
Create a conda env for py27 (a small, separate conda env for each package keeps things tidy)
conda create -n py27
conda install -n py27 -c anaconda python=2.7.6
conda activate py27
Install suggested diamond v0.9.24 from GitHub
conda activate py27
conda install -c bioconda diamond=0.9.24
Install orthofinder on py27 env
conda install -n py27 -c bioconda orthofinder
Test orthofinder is properly installed
orthofinder -h
Run orthofinder on ExampleData
cd ~/orthofinder_tutorial/OrthoFinder
orthofinder -f ExampleData/
results in /home/cd791/orthofinder_tutorial/OrthoFinder/ExampleData/OrthoFinder/Results_Aug12_1/WorkingDirectory/
Hi David,
I am having the same problem. I tried what cd791 suggested but nothing is working.
I have 9 species and am working with DNA, just in case that matters.
Best,
V
Checking required programs are installed
Test can run "mcl -h" - ok
Test can run "fastme -i OrthoFinder/Results_Feb08/WorkingDirectory/SimpleTest.phy -o OrthoFinder/Results_Feb08/WorkingDirectory/SimpleTest.tre" - ok
Traceback (most recent call last):
File "/uufs/chpc.utah.edu/common/home/u6044365/.conda/envs/py27/bin/orthofinder", line 7, in
I have the same problem running orthofinder 2.5.4 on WSL2. For the most part, sdtruong's explanation works for me. The problem seems to originate here:
File "/home/<username>/miniconda3/envs/orthofinder/bin/scripts_of/newick.py", line 208, in read_newick
nw = open(newick, 'rU').read()
Since 'U' is no longer accepted as a file mode in recent Python versions (it was removed in Python 3.11), downgrading to Python 3.7.12 (the most recent version of Python 3.7 at the time of writing) solved this problem for me. There was no need to downgrade diamond or OrthoFinder: I was able to run OrthoFinder 2.5.4 and diamond 2.1.6 (both the most recent versions at the time of writing).
Alternatively you could try fixing the source code to remove 'rU' as a read mode, but I was too lazy.
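For anyone who wants to try the source-code route, here is a minimal sketch of what the replacement could look like. It is not a tested patch to OrthoFinder, and `read_text_universal` is a made-up name for illustration. It relies only on the fact that in Python 3 plain text mode already translates "\r\n" and "\r" line endings to "\n", which is what the old 'U' flag asked for.

```python
def read_text_universal(path):
    """Read a whole text file, normalising line endings to '\\n' (Python 3)."""
    # With the default newline=None, text mode performs universal-newline
    # translation, so 'r' behaves the way 'rU' used to.
    with open(path, "r") as handle:
        return handle.read()
```

Assuming the line quoted above is the only place that uses 'rU', the equivalent change in newick.py would be to open the file with 'r' instead.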