pan-genome-analysis
Issue with step 8
Hello guys, how are you? I'm having a problem with step 08 and get the following error message:
====== starting step08: run fasttree and raxml for tree construction
fasttree time-cost: 1.45 minutes (87.06 seconds)
RAxML tree optimization within the timelimit of 30 minutes
RAxML branch length optimization and rooting
Traceback (most recent call last):
  File "./panX.py", line 303, in <module>
    myPangenome.build_core_tree()
  File "/home/julian/pan-genome-analysis/scripts/pangenome_computation.py", line 200, in build_core_tree
    aln_to_Newick(self.path, self.folders_dict, self.raxml_max_time, self.raxml_path, self.threads)
  File "/home/julian/pan-genome-analysis/scripts/sf_core_tree_build.py", line 75, in aln_to_Newick
    shutil.copy('RAxML_result.branches', out_fname)
  File "/home/julian/miniconda2/envs/panX/lib/python2.7/shutil.py", line 119, in copy
    copyfile(src, dst)
  File "/home/julian/miniconda2/envs/panX/lib/python2.7/shutil.py", line 82, in copyfile
    with open(src, 'rb') as fsrc:
IOError: [Errno 2] No such file or directory: 'RAxML_result.branches'
I checked the raxml log and I found this:
Option -T does not have any effect with the sequential or parallel MPI version. It is used to specify the number of threads for the Pthreads-based parallelization
RAxML can't, parse the alignment file as phylip file it will now try to parse it as FASTA file
ERROR: Sequence AF-673 consists entirely of undetermined values which will be treated as missing data
ERROR: Sequence CMC-MDR-Ab59 consists entirely of undetermined values which will be treated as missing data
ERROR: Sequence HRAB-85 consists entirely of undetermined values which will be treated as missing data
ERROR: Sequence KAB07 consists entirely of undetermined values which will be treated as missing data
ERROR: Found 4 sequences that consist entirely of undetermined values, exiting...
So I figured there might be a problem with the FASTA files that get generated in the previous steps. Any ideas on how to fix this?
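One way to confirm what RAxML is complaining about is to scan the alignment for sequences that contain nothing but gap or ambiguity characters. The following is only a minimal sketch in Python 3, not part of panX: the function names, the example strain name `strainB`, and the set of characters treated as undetermined are assumptions for a nucleotide alignment in FASTA format.

```python
# Sketch: list sequences in a FASTA alignment that carry no determined bases.
# Assumption: these characters count as "undetermined" for nucleotide data.
UNDETERMINED = set("-?NnXx")


def read_fasta(lines):
    """Parse an iterable of FASTA lines into a dict of name -> sequence."""
    parts = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].strip()
            parts[name] = []
        elif name is not None:
            parts[name].append(line)
    return {n: "".join(chunks) for n, chunks in parts.items()}


def all_undetermined(records, undetermined=UNDETERMINED):
    """Return names whose sequence has only undetermined characters.

    An empty sequence is also reported, since it has no determined base.
    """
    return [name for name, seq in records.items()
            if all(ch in undetermined for ch in seq)]


# Toy input; with a real run, pass an open file handle for the alignment
# that was given to RAxML instead of this list.
example = [">AF-673", "----NN--", ">strainB", "AC-TGACA"]
print(all_undetermined(read_fasta(example)))  # ['AF-673']
```

If every strain is reported, the alignment itself is probably empty rather than a few genomes being bad.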
This probably means that your core genome is empty. Are your genomes incomplete, or very diverse?
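To illustrate how a core genome can end up empty, here is a toy sketch in Python 3. It is not how panX computes its core genome. It assumes a simplified definition in which a gene cluster is "core" when it is present in at least a given fraction of strains, and all names (`core_clusters`, `geneA`, `s1`, ...) are hypothetical.

```python
def core_clusters(presence, n_strains, threshold=1.0):
    """Return clusters present in at least threshold * n_strains strains.

    presence maps a cluster name to the set of strains that carry it.
    """
    needed = threshold * n_strains
    return sorted(name for name, strains in presence.items()
                  if len(strains) >= needed)


# Four hypothetical strains, no gene shared by all of them.
presence = {
    "geneA": {"s1", "s2", "s3"},
    "geneB": {"s1", "s2"},
    "geneC": {"s4"},
}

print(core_clusters(presence, n_strains=4, threshold=1.0))  # []
print(core_clusters(presence, n_strains=4, threshold=0.7))  # ['geneA']
print(core_clusters(presence, n_strains=4, threshold=0.3))  # ['geneA', 'geneB']
```

With a strict threshold nothing qualifies, so any alignment built from the core would have no informative columns. Relaxing the threshold lets clusters through, but strains that lack those clusters would then presumably contribute only gaps.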
Hi Richard. Thanks for your prompt response. The genomes are complete. In regard to diversity, they are not clonal strains, but all genomes belong to the same bacterial species.
I have faced the same issue at step 8. In my case, I am analysing phage genomes, each of which is small. Some of them are close to each other, but others are further apart. I had success until step 7 running it with -cb 0.3. Is there anything I can do to make it complete the next steps? I can't generate a complete set of files for pan-genome-visualization if I am stuck at step 7.
Any recommendations would be very welcome. Both panX and pan-genome-visualization are great tools that are making my analysis a lot easier and very detailed.
Sorry, I dropped the ball here. I was traveling when this one came in and it fell through the cracks. So you say step 7 (creating a SNP alignment from the core genome) completed with -cg 0.3 but step 8 (core genome tree) failed? Could you give some more info on what was written to the log (the panX log and/or the RAxML/fasttree logs)? Did step 7 produce a file "geneCluster/SNP_whole_matrix.aln"?
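A quick way to answer that last question is to check whether the file exists, is non-empty, and has sequences of equal length. This is a rough Python 3 sketch under two assumptions: the file is in FASTA format, and the script runs from the directory that contains `geneCluster/`. The function name `summarize_alignment` is made up for this example.

```python
import os


def summarize_alignment(path):
    """Return (number of sequences, sorted distinct lengths), or None.

    None means the file is missing or empty. Assumes FASTA format.
    """
    if not os.path.isfile(path) or os.path.getsize(path) == 0:
        return None
    lengths = {}
    name = None
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                name = line[1:].strip()
                lengths[name] = 0
            elif name is not None:
                lengths[name] += len(line)
    return len(lengths), sorted(set(lengths.values()))


if __name__ == "__main__":
    # Path taken from the question above; adjust to your run directory.
    print(summarize_alignment("geneCluster/SNP_whole_matrix.aln"))
```

A result of `None`, a single length of 0, or more than one distinct length would each suggest that step 7 did not produce a usable alignment.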