Using nucleotide vs. amino acid FASTA files as input
Hi, thank you for building this tool. It is really great and quite user-friendly.
What are the advantages and disadvantages of using nucleotide sequences as the input to OrthoFinder versus using amino acid sequences? The -d flag is not mentioned in the manuscript and is covered only minimally in the wiki. I am specifically interested in getting the most reliable orthogroups, since trees and MSAs can be re-made with stand-alone programs, but I would also like to hear thoughts on the pros and cons of the two input types for any of these outputs.
Linking to a similar question where the recommendation appears to be that proteins are preferable, but both approaches can be useful for the specific case of phylogenetic tree construction (#628).
I have a question related to this topic. I am trying to find orthologs across multiple chromosome-level reference genomes (downloaded from NCBI). Should I use the reference genomes themselves as input with -d, or should I use the annotations for each genome?