phylophlan
[e] expected str, bytes or os.PathLike object, not NoneType
Hi, I have installed PhyloPhlAn version 3.0.60 via conda and am following the example 1 (S. aureus) tutorial on GitHub.
This is the command I am using:
phylophlan -i input_genomes -d phylophlan --diversity medium -f isolates_config.cfg --verbose
This is the error message I am getting:
PhyloPhlAn version 3.0.60 (27 November 2020)
Command line: /opt/miniconda3/bin/phylophlan -i input_genomes -d phylophlan --diversity medium -f isolates_config.cfg --verbose
Automatically setting "input=input_genomes" and "input_folder=/home/ubuntu"
"medium-accurate" preset
Setting "sort=True" because "database=phylophlan"
Setting "min_num_markers=100" since no value has been specified and the "database=phylophlan"
Arguments: {'input': 'input_genomes', 'clean': None, 'output': 'input_genomes_phylophlan', 'database': 'phylophlan', 'db_type': None, 'config_file': 'isolates_config.cfg', 'diversity': 'medium', 'accurate': True, 'fast': False, 'clean_all': False, 'database_list': False, 'submat': 'pfasum60', 'submat_list': False, 'submod_list': False, 'nproc': 1, 'min_num_proteins': 1, 'min_len_protein': 50, 'min_num_markers': 100, 'trim': 'gap_trim', 'gap_perc_threshold': 0.67, 'not_variant_threshold': 0.99, 'subsample': <function onehundred at 0x7f03d9380680>, 'unknown_fraction': 0.3, 'scoring_function': <function trident at 0x7f03d93809e0>, 'sort': True, 'remove_fragmentary_entries': False, 'fragmentary_threshold': 0.85, 'min_num_entries': 4, 'maas': None, 'remove_only_gaps_entries': False, 'mutation_rates': False, 'force_nucleotides': False, 'input_folder': '/home/ubuntu/input_genomes', 'data_folder': 'input_genomes_phylophlan/tmp', 'databases_folder': 'phylophlan_databases/', 'submat_folder': '/opt/miniconda3/lib/python3.7/site-packages/phylophlan/phylophlan_substitution_matrices/', 'submod_folder': '/opt/miniconda3/lib/python3.7/site-packages/phylophlan/phylophlan_substitution_models/', 'configs_folder': '/opt/miniconda3/lib/python3.7/site-packages/phylophlan/phylophlan_configs/', 'output_folder': '', 'genome_extension': '.fna', 'proteome_extension': '.faa', 'update': False, 'verbose': True}
Loading configuration file "isolates_config.cfg"
Checking configuration file
Checking "/opt/miniconda3/bin/diamond"
Checking "/opt/miniconda3/bin/mafft"
Checking "/opt/miniconda3/bin/trimal"
Checking "/opt/miniconda3/bin/FastTreeMP"
Checking "/opt/miniconda3/bin/raxmlHPC-PTHREADS-SSE3"
"db_aa" database "phylophlan_databases/phylophlan/phylophlan.dmnd" present
Loading files from "/home/ubuntu/input_genomes"
Inputs already cleaned
Loading files from "input_genomes_phylophlan/tmp/clean_dna"
"phylophlan" markers already mapped (key: "map_dna")
Selecting 84 markers from "input_genomes_phylophlan/tmp/map_dna"
Selecting "input_genomes_phylophlan/tmp/map_dna/1007-1-F#8.b6o.bkp"
[e] expected str, bytes or os.PathLike object, not NoneType
[e] gene_markers_selection crashed
Any help would be appreciated.
Hi and thanks for reporting this.
From these lines:
Inputs already cleaned
Loading files from "input_genomes_phylophlan/tmp/clean_dna"
"phylophlan" markers already mapped (key: "map_dna")
it seems this is not the first time you have run PhyloPhlAn on that input folder. I'm wondering whether the error you got could be due to a previous run that was stopped or ran into other problems. So, it would be great if you could remove the output folder and re-run the command above (you can add 2>&1 | tee phylophlan.txt at the end of your command to write the output to a log file and send it here).
Many thanks, Francesco
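The clean re-run suggested above could look roughly like the sketch below. The output folder name is taken from the 'output' entry in the log; the check for the phylophlan executable is an added safeguard and not part of the original advice.

```shell
# Output folder name as reported in the 'output' entry of the log above.
out_dir="input_genomes_phylophlan"

# Remove leftovers from the earlier (possibly interrupted) run, if any.
if [ -d "$out_dir" ]; then
  rm -r "$out_dir"
fi

# Re-run the original command. 2>&1 merges stderr into stdout so that
# tee writes both streams to phylophlan.txt while still printing them.
if command -v phylophlan >/dev/null 2>&1; then
  phylophlan -i input_genomes -d phylophlan --diversity medium \
    -f isolates_config.cfg --verbose 2>&1 | tee phylophlan.txt
else
  echo "phylophlan not found on PATH" >&2
fi
```

Deleting the output folder forces PhyloPhlAn to redo the cleaning and mapping steps rather than reuse the intermediate files in input_genomes_phylophlan/tmp.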
Hi, thank you for the quick reply. I removed the output folder and re-ran the command, and it got further than before.
However, a new error now occurs. I have attached the output file.
Thank you.
phylophlan.txt
Hi and thanks for reporting this.
I think the problem here is that you might have created the config file using the --force_nucleotides param, but did not specify it when running PhyloPhlAn. Looking at the log, PhyloPhlAn is converting your genomes into proteomes after mapping them against the 400 universal proteins in the phylophlan database. But the FastTree command at the end is the one for a nucleotide multiple sequence alignment, not an amino acid one:
/.../FastTreeMP [..] -gtr -nt [..]
My suggestion here is to remove the output folder and re-run PhyloPhlAn specifying the --force_nucleotides param. Or, if you want the pipeline to use the proteomes, you should generate a new config file accordingly and then re-run PhyloPhlAn.
Please, let me know if something is not clear.
Many thanks, Francesco
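One way to see which case applies is to look for FastTree's -nt flag in the config file, since that flag is what marks the tree command as a nucleotide one. The sketch below assumes the config file stores the tool parameters as plain text; cfg_mode is a hypothetical helper name, not part of PhyloPhlAn.

```shell
# Hypothetical helper: report whether a config file contains the -nt flag
# as a standalone token (FastTree's switch for nucleotide alignments).
cfg_mode() {
  if grep -Eq -- '(^|[[:space:]])-nt([[:space:]]|$)' "$1"; then
    echo "nucleotide"
  else
    echo "amino acid"
  fi
}

# Check the config file used in this thread, if it is in the current folder.
if [ -f isolates_config.cfg ]; then
  cfg_mode isolates_config.cfg
fi
```

If this prints "nucleotide", adding --force_nucleotides to the phylophlan command keeps the run consistent with the config; if the proteome route is preferred, the config file needs to be regenerated first.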