epa-ng
FastTree
Hi, I'm setting up a pipeline to place query sequences onto a phylogenetic tree of ~5,000 reference sequences. I'm finding RAxML to be very slow for tree building and wondered whether EPA-ng supports trees built with FastTree. Will this be a problem when I have to supply the model parameters to EPA-ng? I've looked online but haven't found any examples of this.
Cheers, Andrew
Hey @atoselan,
as far as I am aware, as long as you have a Newick tree and its model parameters in a form that EPA-ng understands, that should work. See https://github.com/pierrebarbera/epa-ng#setting-the-model-parameters for the specification of the model parameters that EPA-ng expects.
I don't know in which format FastTree outputs its parameters. If they are not in that format, you can take the Newick file from FastTree and use RAxML-ng to obtain the model parameters for it. This does not run a full tree search; it only gives you the parameters, as explained at the link above.
Hope that helps,
Lucas
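The two steps described above can be sketched as a small Python helper that only assembles the command lines. This is a minimal sketch, not a tested pipeline: the file names, the `GTR+G` model string and the `info` prefix are hypothetical placeholders, and the flags (`--evaluate`, `--msa`, `--tree`, `--model`, `--prefix` for RAxML-ng; `--tree`, `--ref-msa`, `--query`, `--model` for EPA-ng) are written as I understand them from the respective documentation, so please check them against the versions you have installed.

```python
def build_commands(tree, ref_msa, query_msa, model="GTR+G", prefix="info"):
    """Return (raxml_ng_cmd, epa_ng_cmd) as argument lists; nothing is executed.

    All file names and the model string are placeholders (assumptions).
    """
    # Step 1: evaluate the fixed FastTree topology with RAxML-ng.
    # No tree search is run; the model parameters are estimated on the given tree.
    raxml_ng_cmd = [
        "raxml-ng", "--evaluate",
        "--msa", str(ref_msa),
        "--tree", str(tree),
        "--model", model,
        "--prefix", prefix,
    ]

    # Assumption: RAxML-ng writes the estimated model to <prefix>.raxml.bestModel.
    best_model = f"{prefix}.raxml.bestModel"

    # Step 2: hand the tree, both alignments and the model file to EPA-ng.
    epa_ng_cmd = [
        "epa-ng",
        "--tree", str(tree),
        "--ref-msa", str(ref_msa),
        "--query", str(query_msa),
        "--model", best_model,
    ]
    return raxml_ng_cmd, epa_ng_cmd
```

Each list can be passed to `subprocess.run(cmd, check=True)` once the paths point at real files. Keeping the commands as lists rather than shell strings avoids quoting problems with unusual file names.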
For a reference tree of ~5,000 sequences, you may also want to take tree inference uncertainty into account; see for instance:
https://academic.oup.com/mbe/article/38/5/1777/6030946
and also our new tool for predicting the difficulty of a phylogenetic analysis:
https://academic.oup.com/mbe/article/39/12/msac254/6832260
Alexis
Many thanks, it turned out to be straightforward to get an info file for the FastTree tree using RAxML. I know this is a different issue, but I can't get PaPaRa to work: I get a vague error message about an inconsistency in the alignment, which I can't resolve. I've started looking at using HMMER/hmmalign instead and wondered if you had any examples to follow for this approach. HMMER and hmmalign run fine, but how do I ensure that the alignments have the same length? I've built an HMM from the reference alignment and aligned the queries to it, but the resulting alignments have different lengths.
Hm, if I recall correctly, hmmer/hmmalign uses a flag -m to keep the length. I'd check their manual :-)
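Whichever hmmalign option ends up being used, it can help to verify the result before handing it to EPA-ng. The following is a minimal sketch in plain Python that checks whether a reference alignment and a query alignment have the same width. It assumes both alignments are available as FASTA text (hmmalign writes Stockholm by default, as far as I know, so a conversion step may be needed first); the function names are made up for this example.

```python
def read_fasta(lines):
    """Parse an iterable of FASTA text lines into a {name: sequence} dict."""
    parts, name = {}, None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].strip()
            parts[name] = []
        elif name is not None:
            parts[name].append(line)
    return {n: "".join(chunks) for n, chunks in parts.items()}


def alignment_width(records):
    """Return the common aligned length, or raise if the sequences disagree."""
    if not records:
        raise ValueError("no sequences found")
    widths = {len(seq) for seq in records.values()}
    if len(widths) != 1:
        raise ValueError(f"inconsistent sequence lengths: {sorted(widths)}")
    return widths.pop()


def check_compatible(ref_lines, query_lines):
    """Raise ValueError unless both alignments share one width; return it."""
    ref_width = alignment_width(read_fasta(ref_lines))
    query_width = alignment_width(read_fasta(query_lines))
    if ref_width != query_width:
        raise ValueError(
            f"reference width {ref_width} != query width {query_width}"
        )
    return ref_width
```

With real files, something like `check_compatible(open("ref.fasta"), open("query.fasta"))` would do, where both file names are placeholders. A mismatch here usually means the query alignment contains insert columns that the reference alignment lacks, or the other way round.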