
Bio::TreeIO produces illegally formatted phyloXML

Open cmzmasek opened this issue 6 years ago • 1 comments

The phyloXML produced by Bio::TreeIO does not conform to the standard. When running a script like the one below, the phyloXML output has elements in the wrong order. According to the standard (see: http://www.phyloxml.org/documentation/version_1.20/phyloxml.html ), "name" comes first in a "clade", followed by "branch_length" and then "confidence". Nested "clade" elements come at the very end.

use strict;
use warnings;
use Bio::TreeIO;

# Newick input tree
my $infile = "t2.txt";
my $treeio = Bio::TreeIO->new(-format => 'newick', -file => $infile);
my $tree = $treeio->next_tree;

# Print what was parsed for each node
for my $node ( $tree->get_nodes ) {
    printf "id: %s branchlength: %s bootstrap: %s\n",
        $node->id || '', $node->branch_length || '', $node->bootstrap || '';
}

# Write the same tree back out as phyloXML
my $outfile = "outfile.xml";
my $newio = Bio::TreeIO->new(-format => 'phyloxml', -file => ">$outfile");
$newio->write_tree($tree);
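
For comparison, here is a minimal hand-written sketch of the element order described above. It is not output from Bio::TreeIO; the node names, branch lengths, and bootstrap values are made up for illustration, and it assumes the clade ordering from the phyloXML documentation linked above.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<phyloxml xmlns="http://www.phyloxml.org">
  <phylogeny rooted="true">
    <clade>
      <!-- hypothetical internal node: name, then branch_length, then confidence -->
      <name>AB</name>
      <branch_length>0.05</branch_length>
      <confidence type="bootstrap">95</confidence>
      <!-- child clades come last, after all other elements of the parent -->
      <clade>
        <name>A</name>
        <branch_length>0.1</branch_length>
      </clade>
      <clade>
        <name>B</name>
        <branch_length>0.2</branch_length>
      </clade>
    </clade>
  </phylogeny>
</phyloxml>
```

A file in which, for example, the nested "clade" elements appear before "name" or "branch_length" of their parent would be rejected by a validating parser.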

cmzmasek • Jul 17 '17 21:07

@cmzmasek that is very possible. The code for this was written many moons ago during GSoC, so even if it was compliant then, it may not be now.

My recommendation is that we pull the phyloXML code out into a new repository where it can be worked on independently of the main bioperl release. We could then set up tests for issues like this. The main question, once that transition is made, is finding someone to take this on.

cjfields • Dec 19 '17 20:12