Phylo.jl

editing leaf names

Open shaul-pollak opened this issue 3 years ago • 4 comments

Hi! Thanks for this awesome package! Is it possible to change the leaf names of a tree after it has been created? I couldn't find this functionality. Thanks!

shaul-pollak, Apr 08 '21 16:04

Hmm. No, not at the moment. Can I ask why you want to do that?

richardreeve, Apr 08 '21 20:04

Of course! For instance, when I'm reading a tree that is the output of some program, a leaf may have a very complicated name like GCF_somethingsomething.1_ASMsomethingsomething. This is hard for my puny human mind, so I want to change it to just GCF_something. This happens quite a lot when working with external databases like NCBI or JGI. Thanks for the quick answer!

shaul-pollak, Apr 08 '21 20:04

Fair enough. I'll have a think about whether there's an easy way of doing it...

richardreeve, Apr 08 '21 20:04

Thank you!!

shaul-pollak, Apr 08 '21 20:04

Sorry it has taken so long to get around to this, @shaul-pollak. I think I have a workaround, on the off chance you're still interested. As of v0.5.1, we can export trees, so we can solve your problem by exporting the tree to Newick format and reimporting it: for various Nexus-format-related reasons, the exported tree can have different node names from the tree in memory.

So:

julia> using Random, Phylo

julia> tree = rand(Nonultrametric(10));

julia> nn = getnodenames(tree);

julia> d = Dict(nn .=> nn);

julia> d["tip 1"] = "first tip"
"first tip"

julia> new_tree = parsenewick(Phylo.outputtree(tree, Newick(d)));

julia> getleafnames(new_tree)
10-element Vector{String}:
 "tip 8"
 "tip 3"
 "tip 5"
 "tip 7"
 "tip 9"
 "first tip"
 "tip 6"
 "tip 4"
 "tip 2"
 "tip 10"
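For the use case raised earlier in the thread (long accession-style leaf names), the renaming dictionary could be built programmatically rather than entry by entry. This is only a minimal sketch: the example names, the `shorten` helper and the regular expression are hypothetical and would need adapting to the real naming scheme.

```julia
# Hypothetical leaf names, standing in for getleafnames(tree) on a real tree
long_names = ["GCF_000005845.2_ASM584v2", "GCF_000006945.2_ASM694v2"]

# Hypothetical helper: keep only the accession prefix before the version
# suffix; names that do not match the pattern are returned unchanged
function shorten(name::AbstractString)
    m = match(r"^(GC[AF]_\d+)", name)
    return m === nothing ? String(name) : String(m.captures[1])
end

# Map each old name to its shortened form
d = Dict(long_names .=> shorten.(long_names))

# Shortened names must stay unique, otherwise two leaves would collide
@assert allunique(values(d))

# With Phylo loaded and `tree` holding these leaves, the export and
# reimport step from this comment would then be:
# new_tree = parsenewick(Phylo.outputtree(tree, Newick(d)))
```

The uniqueness check matters because stripping a version suffix can map two different assemblies onto the same short name.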

You can also leave some nodes out of the dictionary, and they'll be given default names as they are read in:

julia> ln = getleafnames(tree);

julia> d2 = Dict(ln .=> string.(Ref("newer "), ln))
Dict{String, String} with 10 entries:
  "tip 7"  => "newer tip 7"
  "tip 4"  => "newer tip 4"
  "tip 8"  => "newer tip 8"
  "tip 1"  => "newer tip 1"
  "tip 9"  => "newer tip 9"
  "tip 2"  => "newer tip 2"
  "tip 10" => "newer tip 10"
  "tip 6"  => "newer tip 6"
  "tip 5"  => "newer tip 5"
  "tip 3"  => "newer tip 3"

julia> newer_tree = parsenewick(Phylo.outputtree(tree, Newick(d2)));

julia> getnodenames(newer_tree)
19-element Vector{String}:
 "Node 19"
 "Node 18"
 "Node 16"
 "newer tip 4"
 "newer tip 2"
 "newer tip 10"
 "Node 13"
 "Node 12"
 "newer tip 1"
 "newer tip 6"
 "Node 9"
 "Node 8"
 "newer tip 9"
 "newer tip 7"
 "Node 5"
 "newer tip 8"
 "Node 4"
 "newer tip 3"
 "newer tip 5"

Anyway, that's about as good as it gets at the moment. I don't know if it's still any use to you, but I thought I'd mention it just in case.

richardreeve, Dec 20 '23 02:12

Okay, renamenode!(tree, node, "new name") now works for some tree types in #91. It returns true if it succeeds and false if it fails or the tree type is not supported; note that it can fail even for supported tree types if the new name is a duplicate. The new RecursiveTree types are able to rename nodes, so long as the tree either has no leaf information or stores it in a Dict, which covers the default types like RootedTree.
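As a rough illustration of that call, here is a minimal sketch. It assumes a Phylo.jl release that includes #91 and exports renamenode!, and it assumes the node can be identified by its name; neither assumption is confirmed in this thread, which only gives the call shape.

```julia
using Phylo

# Random tree of the default type, built as in the earlier example
tree = rand(Nonultrametric(10))

# Assumption: the node argument may be given as the node's current name
ok = renamenode!(tree, "tip 1", "first tip")

# Check the returned Bool rather than assuming the rename happened
ok || @warn "rename failed: duplicate name or unsupported tree type"

getleafnames(tree)
```

Checking the return value is worthwhile because, as described above, a failed rename is reported through false rather than an error.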

richardreeve, Jan 05 '24 13:01