woltka
Which parameters does the gotu command actually use?
Reading the code, it is hard to tell exactly which parameters or values the gotu command uses. It looks like gotu calls the classify command without any parameters (everything is None), which then calls workflow, which in turn makes a few calls to other functions. With everything set to None, it is not clear what actually happens.
Another way to think about this question is: what's the difference between classify and gotu?
@antgonza Thanks for the insightful comments! gotu is a minimal subset of classify, i.e., no classification; it just assigns queries to subjects but not to higher classification units. So it does not need most of the parameters of the classify command. In this program, gotu and classify share the same workflow to ensure comparability between results.
The functions being called will return None when their parameters are None. For example, two main settings differentiate a gotu workflow from a taxonomic classification workflow: whether there is a classification system (loaded by --nodes or --lineages etc.), and whether the target rank (--rank) is none or a rank name (e.g., "species"). Despite this difference, the two workflows are identical in logic.
PS: in the program design, the entire classification system is tree-like, with subject IDs as tips. Therefore, a classification system that has only subjects and no higher hierarchies === gOTU.
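To make the "same workflow, different settings" idea concrete, here is a minimal Python sketch. It is not woltka's actual code: the functions assign, workflow, gotu and classify below, their signatures, and the toy data are all hypothetical stand-ins. Dropping queries that map to more than one unit is also a simplification made only for this example.

```python
from collections import Counter


def assign(subjects, rank=None, parents=None, ranks=None):
    """Map subject IDs to units at the target rank.

    With rank=None there is nothing to climb, so the subjects themselves
    are the units (the gOTU case).
    """
    if rank is None:
        return set(subjects)
    units = set()
    for subject in subjects:
        node = subject
        # Walk up the tree until a node at the target rank is found.
        while node is not None and ranks.get(node) != rank:
            node = parents.get(node)
        if node is not None:
            units.add(node)
    return units


def workflow(alignment, rank=None, parents=None, ranks=None):
    """Shared workflow: count queries per unit.

    Simplification for this sketch: queries that map to more than one
    unit (or to none) are dropped.
    """
    counts = Counter()
    for subjects in alignment.values():
        units = assign(subjects, rank, parents, ranks)
        if len(units) == 1:
            counts[units.pop()] += 1
    return dict(counts)


def gotu(alignment):
    # No classification system and no target rank: everything stays None.
    return workflow(alignment)


def classify(alignment, rank, parents, ranks):
    return workflow(alignment, rank, parents, ranks)


# Hypothetical toy data: query -> subjects it aligned to.
alignment = {"r1": ["G1"], "r2": ["G2"], "r3": ["G1", "G2"], "r4": ["G3"]}
# Hypothetical tree: subjects (tips) G1-G3 under two species.
parents = {"G1": "sp1", "G2": "sp1", "G3": "sp2"}
ranks = {"sp1": "species", "sp2": "species"}

print(gotu(alignment))
# {'G1': 1, 'G2': 1, 'G3': 1}
print(classify(alignment, "species", parents, ranks))
# {'sp1': 3, 'sp2': 1}
```

In this sketch the only difference between the two entry points is which arguments reach workflow. Query r3 is ambiguous at the subject level, so it is dropped by gotu, but it resolves to a single species once a tree and a rank are supplied.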
Thank you for the explanation; it is perhaps worth adding this information to the documentation and clearly listing all the parameters used by each command.