
Create HGTrobustness_parsing

Open CaroleBelliardo opened this issue 4 months ago • 0 comments

add HGTrobustness_parsing

This script, HGTrobustness_parsing.py, is designed to facilitate phylogenetic validation of putative Horizontal Gene Transfer (HGT) events, based on an analysis of protein counts and taxonomic ratios within the sister branches of phylogenetic trees.

Usage

The script is run from the command line and takes the following arguments, some required and some optional:

--fasttree_tree_results or -a: This is the file path to fasttree_tree_results from AvP results. This argument is required.

--fasttree or -t: This is the directory path containing trees in Newick format. This argument is required.

--nb_prot or -n: This is the total number of proteins in the sister branch and ancestral sister branch. This argument is optional and defaults to 3 if not provided.

--clade_ratio or -n: This is the clade ratio, the taxonomic ratio computed within the sister branch and ancestral sister branch. This argument is optional and defaults to 0.8 if not provided.

--output_names or -o: This is the name of the output files. This argument is required.
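The argument list above could be declared with argparse roughly as follows. This is only a sketch, not the script's actual code: the function name build_parser, the help strings, and the int/float types (inferred from the placeholders in the example command) are assumptions. The issue lists -n as the short flag for both --nb_prot and --clade_ratio; argparse rejects a duplicated option string, so the sketch gives --clade_ratio a long form only.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the arguments described in this issue."""
    parser = argparse.ArgumentParser(
        description="Phylogenetic validation of putative HGT events (sketch)."
    )
    parser.add_argument("-a", "--fasttree_tree_results", required=True,
                        help="path to the fasttree_tree_results file from AvP")
    parser.add_argument("-t", "--fasttree", required=True,
                        help="directory containing trees in Newick format")
    parser.add_argument("-n", "--nb_prot", type=int, default=3,
                        help="number of proteins in the sister and "
                             "ancestral sister branches (default: 3)")
    # Assumption: long form only, because -n is already taken by --nb_prot.
    parser.add_argument("--clade_ratio", type=float, default=0.8,
                        help="clade ratio (default: 0.8)")
    parser.add_argument("-o", "--output_names", required=True,
                        help="name of the output files")
    return parser


# Example with placeholder values; the optional arguments fall back to defaults.
args = build_parser().parse_args(
    ["-a", "results.txt", "-t", "trees_dir", "-o", "hgt_out"]
)
print(args.nb_prot, args.clade_ratio)
```

Passing an explicit list to parse_args keeps the example runnable without a real command line; calling parse_args() with no argument would read sys.argv instead.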

The script also includes a function import_tree that imports a tree structure from a Newick file and roots the tree at the midpoint. This function takes as an argument the name of the file containing the tree structure in Newick format.
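The issue does not say which tree library import_tree relies on. To illustrate what rooting at the midpoint means, here is a standard-library-only sketch that parses a simple Newick string, finds the two most distant leaves, and reports the branch holding the point halfway between them. All function names and the synthetic "_internalN" node ids are hypothetical; the sketch locates the midpoint but does not rebuild a rooted tree object, and it assumes at least two leaves with positive branch lengths and no quoted labels.

```python
import re
from collections import defaultdict

PUNCT = {"(", ")", ",", ";"}


def _length(token):
    """Branch length after the last ':' in a token, or 0.0 if absent."""
    _, sep, value = token.rpartition(":")
    return float(value) if sep else 0.0


def parse_newick(text):
    """Turn a simple Newick string into an undirected weighted graph."""
    tokens = re.findall(r"[(),;]|[^(),;\s]+", text)
    adj, leaves, stack, n_internal = defaultdict(dict), [], [], 0
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok == "(":
            stack.append(f"_internal{n_internal}")
            n_internal += 1
        elif tok == ")":
            node = stack.pop()
            nxt = tokens[i + 1] if i + 1 < len(tokens) else ";"
            length = 0.0
            if nxt not in PUNCT:  # optional "label:length" after ")"
                length = _length(nxt)
                i += 1
            if stack:  # the outermost node has no parent to link to
                adj[stack[-1]][node] = adj[node][stack[-1]] = length
        elif tok not in PUNCT:  # a leaf such as "A:1"
            name = tok.split(":")[0]
            leaves.append(name)
            adj[stack[-1]][name] = adj[name][stack[-1]] = _length(tok)
        i += 1
    return adj, leaves


def distances_from(adj, start):
    """Path length from start to every node, plus each node's predecessor."""
    dist, parent, todo = {start: 0.0}, {start: None}, [start]
    while todo:
        node = todo.pop()
        for neighbour, weight in adj[node].items():
            if neighbour not in dist:
                dist[neighbour] = dist[node] + weight
                parent[neighbour] = node
                todo.append(neighbour)
    return dist, parent


def midpoint_edge(newick):
    """Return (u, v, offset): the midpoint lies on edge u-v, offset from u."""
    adj, leaves = parse_newick(newick)
    dist, _ = distances_from(adj, leaves[0])
    a = max(leaves, key=dist.get)        # one end of the longest path
    dist, parent = distances_from(adj, a)
    b = max(leaves, key=dist.get)        # the other end
    half = dist[b] / 2
    node = b
    while dist[parent[node]] > half:     # walk from b back toward a
        node = parent[node]
    return parent[node], node, half - dist[parent[node]]


def import_tree_midpoint(tree_file):
    """Hypothetical file-based wrapper, echoing import_tree's single argument."""
    with open(tree_file) as handle:
        return midpoint_edge(handle.read())


print(midpoint_edge("(A:1,(B:1,C:4):1);"))
```

In the example tree, the longest leaf-to-leaf path runs from A to C with total length 6, so the midpoint sits 3 units from C on the branch leading to C.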

To run the script, navigate to the directory containing the script and use the following command, replacing the argument values with those appropriate for your use case:

$ python Parse_tree_2024.py --fasttree_tree_results [your_results_file/file path] --fasttree [your_tree_directory/dir path] --nb_prot [your_protein_number/int] --clade_ratio [your_clade_ratio/float] --output_names [your_output_file_name/path]

CaroleBelliardo, Feb 26 '24 13:02