FastTree

Open atoselan opened this issue 1 year ago • 4 comments

Hi, I'm setting up a pipeline to place query sequences onto a phylogenetic tree of ~5,000 reference sequences. I'm finding RAxML very slow for tree building and wondered whether epa-ng supports trees built with FastTree. Will this be a problem when I have to supply the model parameters to epa-ng? I've looked online but haven't found any examples of this.

Cheers, Andrew

atoselan avatar Mar 21 '23 16:03 atoselan

Hey @atoselan,

as far as I am aware, as long as you get a newick tree and its model parameters in some form that EPA-ng understands, that should work. See https://github.com/pierrebarbera/epa-ng#setting-the-model-parameters for the specification of the model parameters that EPA-ng expects.

I don't know in which format FastTree outputs its parameters. If they are not in that format, once you have the newick file from FastTree, you can use RAxML-ng to obtain the model parameters for it, which will not run the whole tree search, but only give you these params, as explained in the above link as well.

Hope that helps Lucas

lczech avatar Mar 21 '23 18:03 lczech
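
The two steps suggested above (evaluate the FastTree tree with RAxML-ng, then pass the resulting model file to EPA-ng) could be sketched roughly as follows. This is only an illustration that assembles the command lines: the file names `ref.fasta`, `fasttree.nwk` and `query.fasta`, the `info` prefix and the `GTR+G` model are assumptions, and the helper functions are hypothetical. Check the flags against the EPA-ng README for your installed versions before running anything.

```python
import shlex


def raxml_evaluate_cmd(msa, tree, model="GTR+G", prefix="info"):
    """Build a raxml-ng call that estimates model parameters on a fixed tree.

    --evaluate optimizes parameters for the given tree without a tree search.
    """
    return [
        "raxml-ng", "--evaluate",
        "--msa", msa,
        "--tree", tree,
        "--model", model,
        "--prefix", prefix,
    ]


def epa_ng_cmd(ref_msa, tree, query, prefix="info"):
    """Build an epa-ng call that reuses the model file written by raxml-ng."""
    # Assumption: raxml-ng writes its model to <prefix>.raxml.bestModel.
    model_file = f"{prefix}.raxml.bestModel"
    return [
        "epa-ng",
        "--ref-msa", ref_msa,
        "--tree", tree,
        "--query", query,
        "--model", model_file,
    ]


if __name__ == "__main__":
    # Hypothetical file names; print the commands instead of running them.
    steps = [
        raxml_evaluate_cmd("ref.fasta", "fasttree.nwk"),
        epa_ng_cmd("ref.fasta", "fasttree.nwk", "query.fasta"),
    ]
    for cmd in steps:
        print(shlex.join(cmd))
```

To actually run the steps, each list can be handed to `subprocess.run(cmd, check=True)`. The model given to `raxml-ng` should presumably match the one used to build the FastTree tree.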

For a reference tree of 5,000 sequences, you should perhaps also take tree inference uncertainty into account; see for instance here:

https://academic.oup.com/mbe/article/38/5/1777/6030946

and also our new tool for predicting the difficulty of a phylogenetic analysis:

https://academic.oup.com/mbe/article/39/12/msac254/6832260

Alexis

-- Alexandros (Alexis) Stamatakis

ERA Chair, Institute of Computer Science, Foundation for Research and Technology - Hellas Research Group Leader, Heidelberg Institute for Theoretical Studies Full Professor, Dept. of Informatics, Karlsruhe Institute of Technology

www.biocomp.gr (Crete lab) www.exelixis-lab.org (Heidelberg lab)

stamatak avatar Mar 22 '23 06:03 stamatak

Many thanks, it turned out to be straightforward to get an info file for the FastTree tree using RAxML. I know this is a different issue, but I can't get PaPaRa to work: I get a vague error message about an inconsistency in the alignment, which I can't resolve. I've started looking at using HMMER (hmmalign) instead and wondered if you had any examples to follow for this approach. hmmbuild and hmmalign run fine, but how do I ensure that the reference and query alignments have the same length? I've created an HMM from the reference alignment and aligned the queries to it, but now the alignments are of different lengths.

atoselan avatar Mar 23 '23 10:03 atoselan

Hm, if I recall correctly, hmmer/hmmalign uses a flag -m to keep the length. I'd check their manual :-)

lczech avatar Mar 30 '23 08:03 lczech
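
Whichever aligner is used, the reference and query alignments can be checked for equal width before they are passed to EPA-ng. The following is a minimal sketch in plain Python, assuming both alignments are available as aligned FASTA text; the function names are hypothetical and not part of EPA-ng or HMMER.

```python
def read_fasta(text):
    """Parse FASTA text into a dict mapping each header to its sequence."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].strip()
            records[name] = ""
        elif name is not None:
            # Sequences may be wrapped over several lines.
            records[name] += line
    return records


def alignment_width(records):
    """Return the common sequence length, or raise if it is not unique."""
    if not records:
        raise ValueError("no sequences found")
    widths = {len(seq) for seq in records.values()}
    if len(widths) != 1:
        raise ValueError(f"sequences differ in length: {sorted(widths)}")
    return widths.pop()


def same_width(ref_text, query_text):
    """True if both alignments are internally consistent and equally wide."""
    ref_width = alignment_width(read_fasta(ref_text))
    query_width = alignment_width(read_fasta(query_text))
    return ref_width == query_width
```

For example, `same_width(open("ref.fasta").read(), open("query.fasta").read())` with hypothetical file names returns `False` when the query alignment is narrower or wider than the reference, which is the situation described above.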