charcoal
provide rankings of identification, removals, and so on
inspired by https://github.com/dib-lab/charcoal/issues/94. I'm not (yet) sure how to develop true confidence values, but we could certainly provide rankings. for example:
- least confident taxonomic identification - this would be some sort of composite score of f_match and f_ident that lets us rank genomes by the number of hashes that matched at the right level
- most/least confident removals of contigs - based presumably on number of hashes, and maybe "strength" of signal?
the goal would simply be to highlight our relative confidence in each interpretation.
or we could engage with real statistics, too :). I wonder if @ACharbonneau is interested?
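
as a rough illustration of the first ranking, here is a minimal sketch. everything in it is an assumption for discussion: the `GenomeSummary` record, its field names, and the choice of `f_ident * f_match` as the composite score are hypothetical, not charcoal's actual data structures or an agreed-upon formula. ties are broken by hash count, so genomes backed by fewer hashes sort as less confident.

```python
from dataclasses import dataclass


@dataclass
class GenomeSummary:
    """Hypothetical per-genome summary; field names are assumptions."""
    name: str
    f_ident: float   # fraction of hashes that were identified (0..1)
    f_match: float   # fraction of identified hashes matching the assigned lineage (0..1)
    n_hashes: int    # total number of hashes in the genome


def composite_score(g: GenomeSummary) -> float:
    # Assumed composite: the product is high only when both fractions are high.
    return g.f_ident * g.f_match


def rank_least_confident(genomes):
    # Sort ascending by composite score; on ties, fewer hashes ranks first
    # (less evidence is treated as lower confidence).
    return sorted(genomes, key=lambda g: (composite_score(g), g.n_hashes))


if __name__ == "__main__":
    example = [
        GenomeSummary("genome_a", f_ident=0.90, f_match=0.95, n_hashes=5000),
        GenomeSummary("genome_b", f_ident=0.50, f_match=0.60, n_hashes=1200),
        GenomeSummary("genome_c", f_ident=0.80, f_match=0.40, n_hashes=3000),
    ]
    for g in rank_least_confident(example):
        print(f"{g.name}\t{composite_score(g):.3f}\t{g.n_hashes}")
```

this only gives an ordering, not a calibrated confidence value; a weighted or rank-based combination would slot into `composite_score` the same way.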