taxize
`downstream` misses descendants of ambiguous clades
This call
taxize::downstream('Brassicaceae', db='ncbi', downto='genus')
loses all descendants of the clade 'Brassicaceae incertae sedis'. I believe this clade gets filtered out as ambiguous when `children` is called on 'Brassicaceae'.
In this particular case, I believe the species in 'Brassicaceae incertae sedis' are predicted to be in 'Brassicaceae', but their particular location in the family is unknown. So I think they should be included as downstream of 'Brassicaceae'.
We can pass the `ambiguous=TRUE` argument to `ncbi_downstream`. This in turn should, according to the documentation, be passed to `ncbi_children`.
taxize::downstream('Brassicaceae', db='ncbi', downto='genus', ambiguous=TRUE)
However, the results are the same, so I think the argument is not getting passed.
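To illustrate the kind of failure suspected here, below is a minimal sketch of one common way an argument can go missing in R: a wrapper accepts `...` but does not forward it to the inner call. The functions `toy_children`, `toy_downstream_dropped`, and `toy_downstream_forwarded` are hypothetical stand-ins written for this example. They are not the taxize implementation, and this may not be the actual cause in `ncbi_downstream`.

```r
# Hypothetical stand-in for a children lookup; NOT the real taxize function.
toy_children <- function(id, ambiguous = FALSE) {
  kids <- c("Arabidopsis", "Brassica", "Brassicaceae incertae sedis")
  if (!ambiguous) {
    # Drop names that look ambiguous (toy rule for illustration only).
    kids <- kids[!grepl("incertae sedis", kids)]
  }
  kids
}

# Wrapper that accepts `...` but never forwards it, so `ambiguous` is lost.
toy_downstream_dropped <- function(id, ...) {
  toy_children(id)
}

# Wrapper that forwards `...` to the inner call.
toy_downstream_forwarded <- function(id, ...) {
  toy_children(id, ...)
}

dropped <- toy_downstream_dropped("Brassicaceae", ambiguous = TRUE)
forwarded <- toy_downstream_forwarded("Brassicaceae", ambiguous = TRUE)
```

In this toy setup, the first wrapper raises no error and no warning, which is why the symptom is simply "the results are the same".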
Also, it is reasonable that a user might want to keep ambiguous nodes but filter ambiguous species. I would suggest adding two new arguments to `ncbi_downstream`: `ambiguous_nodes=TRUE` and `ambiguous_species=FALSE`. I am not certain, though, if keeping ambiguous nodes is the right thing to do by default.
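As a rough sketch of what separate handling could look like, the example below filters a toy table of children with two independent flags. Everything here is an assumption made for illustration: the helper `filter_ambiguous`, the made-up `children` table, and the name pattern used to decide what counts as ambiguous. It is not how taxize or NCBI actually classify ambiguous taxa.

```r
# Toy child table; names and ranks are invented for illustration.
children <- data.frame(
  name = c("Arabidopsis", "Brassicaceae incertae sedis",
           "Brassica sp. X1", "Brassica napus",
           "unclassified Brassicaceae"),
  rank = c("genus", "no rank", "species", "species", "no rank"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: ambiguous species and ambiguous non-species nodes
# are kept or dropped independently of each other.
filter_ambiguous <- function(x, ambiguous_nodes = TRUE,
                             ambiguous_species = FALSE) {
  # Assumed pattern for "ambiguous" names; a real rule may differ.
  pattern <- "incertae sedis|unclassified| sp\\."
  is_ambiguous <- grepl(pattern, x$name, ignore.case = TRUE)
  is_species <- x$rank == "species"
  discard <- is_ambiguous &
    ((is_species & !ambiguous_species) | (!is_species & !ambiguous_nodes))
  x[!discard, , drop = FALSE]
}

# Proposed defaults: keep ambiguous nodes, drop ambiguous species.
kept_default <- filter_ambiguous(children)

# Opposite choice: drop ambiguous nodes, keep ambiguous species.
kept_flipped <- filter_ambiguous(children, ambiguous_nodes = FALSE,
                                 ambiguous_species = TRUE)
```

With the proposed defaults, 'Brassicaceae incertae sedis' survives as a node, so its descendants could still be reached, while the placeholder species is dropped.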
I think I fixed the argument passing issue. But there is still the question of whether there should be special handling of ambiguous nodes.
thanks, will take a look tomorrow to familiarize myself with it, can't make educ. opinion right now (boarding 🛫 soon)
No worries, have a nice flight!
the merge closed this, but you did say it partially addresses this, i assume you want this to remain open, yes?
Yeah, there is just the matter of whether we want to be able to handle ambiguous nodes and ambiguous species differently.
yeah, will reopen
@arendsee any further thoughts on "whether we want to be able to handle ambiguous nodes and ambiguous species differently"?
@sckott Nothing particularly new. `taxizedb` can distinguish ambiguous nodes and ambiguous species, e.g.
taxizedb::downstream(3700, downto='genus', ambiguous_nodes=FALSE, ambiguous_species=TRUE)
Adding the same arguments to `taxize` might be good, although I am not sure whether the break from the existing API is worth the gain in control.
bumping this to next milestone