pan-genome-analysis Issue with step 8

Hello guys, how are you? I'm having a problem with step 8. I get the following error message:

======  starting step08: run fasttree and raxml for tree construction
 fasttree time-cost:  1.45 minutes (87.06 seconds)
RAxML tree optimization within the timelimit of 30 minutes
RAxML branch length optimization and rooting
Traceback (most recent call last):
  File "./panX.py", line 303, in <module>
    myPangenome.build_core_tree()
  File "/home/julian/pan-genome-analysis/scripts/pangenome_computation.py", line 200, in build_core_tree
    aln_to_Newick(self.path, self.folders_dict, self.raxml_max_time, self.raxml_path, self.threads)
  File "/home/julian/pan-genome-analysis/scripts/sf_core_tree_build.py", line 75, in aln_to_Newick
    shutil.copy('RAxML_result.branches', out_fname)
  File "/home/julian/miniconda2/envs/panX/lib/python2.7/shutil.py", line 119, in copy
    copyfile(src, dst)
  File "/home/julian/miniconda2/envs/panX/lib/python2.7/shutil.py", line 82, in copyfile
    with open(src, 'rb') as fsrc:
IOError: [Errno 2] No such file or directory: 'RAxML_result.branches'

I checked the raxml log and I found this:

Option -T does not have any effect with the sequential or parallel MPI version. It is used to specify the number of threads for the Pthreads-based parallelization

RAxML can't, parse the alignment file as phylip file it will now try to parse it as FASTA file

ERROR: Sequence AF-673 consists entirely of undetermined values which will be treated as missing data
ERROR: Sequence CMC-MDR-Ab59 consists entirely of undetermined values which will be treated as missing data
ERROR: Sequence HRAB-85 consists entirely of undetermined values which will be treated as missing data
ERROR: Sequence KAB07 consists entirely of undetermined values which will be treated as missing data
ERROR: Found 4 sequences that consist entirely of undetermined values, exiting...

So I figured there might be a problem with the FASTA files that get generated in the previous steps. Any ideas on how to fix this?
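One way to confirm what the RAxML log reports is to scan the alignment for sequences that contain nothing but gap or ambiguity characters. The following is only a minimal sketch in Python 3, not part of panX: the helper names `read_fasta` and `undetermined_only` are made up for illustration, and the set of characters counted as "undetermined" is an assumption that may not match RAxML's own definition exactly.

```python
# Characters counted as "undetermined" here. This set is an assumption
# for illustration and may differ from what RAxML uses internally.
UNDETERMINED = set("-?NnXx")


def read_fasta(lines):
    """Parse an iterable of FASTA text lines into a {name: sequence} dict."""
    records = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].strip()
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}


def undetermined_only(records):
    """Return names of sequences made up solely of undetermined characters."""
    return [
        name
        for name, seq in records.items()
        if all(char in UNDETERMINED for char in seq)
    ]


# Small in-memory demo; with a real file, pass the open file handle
# to read_fasta instead of this list.
demo = [">strainA", "ACGT", ">strainB", "----", "NN-N", ">strainC", "A-N-"]
print(undetermined_only(read_fasta(demo)))
```

If every sequence in the alignment is flagged, or only a handful of strains are, that would suggest the alignment handed to RAxML carries no usable sites for those strains.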

Apr 10 '19 13:04 jpaganini

This probably means that your core genome is empty. Are your genomes incomplete or very diverse?

Apr 10 '19 14:04 rneher

Hi Richard. Thanks for your prompt response. The genomes are complete. As for diversity, they are not clonal strains, but all genomes belong to the same bacterial species.

Apr 10 '19 15:04 jpaganini

I have faced the same issue at step 8. In my case, I am analysing phage genomes, each of which is small in size. Some of them are close to each other, but others are further apart. I had success until step 7 running it with -cg 0.3. Is there anything I can do to make it complete the next steps? I can't generate a complete set of files for pan-genome-visualization if I am stuck at step 7.

Any recommendations would be very welcome. Both panX and pan-genome-visualization are great tools that are making my analysis a lot easier and very detailed.

Oct 04 '19 09:10 avilella

Sorry I dropped the ball here. I was traveling when this one came in and it fell through the cracks. So you say step 7 (creating a SNP alignment from the core genome) completed with -cg 0.3 but step 8 (core genome tree) failed? Could you give some more info on what was written to the log (the panX log and/or the RAxML/FastTree logs)? Did step 7 produce a file "geneCluster/SNP_whole_matrix.aln"?
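To answer the last question quickly, something like the sketch below could report whether the file exists and how large the alignment is. It is a rough Python 3 helper, not part of panX; the name `alignment_summary` is hypothetical, and it assumes the file is FASTA-formatted and that the path is given relative to the run folder.

```python
import os


def alignment_summary(path):
    """Return None if the file is missing, otherwise a tuple of
    (number of sequences, sorted list of distinct sequence lengths)."""
    if not os.path.isfile(path):
        return None
    lengths = []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                # New record: start counting its length from zero.
                lengths.append(0)
            elif line and lengths:
                lengths[-1] += len(line)
    return len(lengths), sorted(set(lengths))


# Path relative to the run folder (an assumption about the layout).
print(alignment_summary("geneCluster/SNP_whole_matrix.aln"))
```

A result of `None` means the file was not written. A result with zero sequences, or with zero-length sequences, would point to an empty core genome rather than a problem in the tree-building step itself.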

Nov 09 '19 13:11 rneher
