Tessa Pierce Ward
`sourmash compare` outputs a similarity matrix. Many tools want the opposite, a distance matrix. It would be relatively straightforward to support `--distance` (output `1 - value` for each entry). This would work for jaccard, containment,...
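As a rough illustration of the `1 - value` idea, here is a minimal sketch in plain Python. The function name `similarity_to_distance` and the example matrix are made up for illustration; this is not sourmash code and not how a `--distance` flag would necessarily be implemented.

```python
def similarity_to_distance(matrix):
    """Return a new matrix with each similarity s replaced by 1 - s.

    Hypothetical helper for illustration only; assumes every value
    is a similarity in the range [0, 1].
    """
    return [[1.0 - value for value in row] for row in matrix]


# Made-up 3x3 similarity matrix (1.0 on the diagonal).
similarity = [
    [1.0, 0.75, 0.5],
    [0.75, 1.0, 0.25],
    [0.5, 0.25, 1.0],
]

distance = similarity_to_distance(similarity)
for row in distance:
    print(row)
```

The diagonal becomes 0 and any asymmetry in the input is preserved, since each entry is transformed independently.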
`sourmash tax genome --force` fails silently or yields a confusing error if you pass in multiple gather results for the same query. I accidentally passed in both k7 and k10 gather...
I think I mentioned this as part of another issue, but I think it would make taxonomy analyses a bit simpler if we included our taxonomy csv (optionally formatted as...
When running `sourmash compare`, the matrix will be asymmetric. But which direction is which? And is it the same for `containment` and `containment-ani`? Based on code here: https://github.com/sourmash-bio/sourmash/blob/latest/src/sourmash/compare.py#L75-L90 For `--containment`,...
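To make the source of the asymmetry concrete, here is a small sketch using plain Python sets as stand-ins for hash sketches. The `containment` helper and the example sets are hypothetical, and the sketch says nothing about which direction ends up in which row or column of the `sourmash compare` matrix.

```python
def containment(a, b):
    """Fraction of the elements of `a` that are also present in `b`.

    Hypothetical helper for illustration; `a` and `b` are plain sets
    standing in for hash sketches.
    """
    if not a:
        return 0.0
    return len(a & b) / len(a)


small = {1, 2, 3, 4}
large = {1, 2, 3, 4, 5, 6, 7, 8}

small_in_large = containment(small, large)  # all of `small` is in `large`
large_in_small = containment(large, small)  # only half of `large` is in `small`

print(small_in_large, large_in_small)
```

Because the denominator is the size of the first argument only, swapping the arguments changes the value, which is why the direction of each matrix cell needs to be documented.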
I'm having trouble matching some of my protein sigs to the rs207 taxonomy, and I'm noticing some mismatches in `GCA` vs `GCF` identifiers. Looking at a taxonomy csv ``` grep...
I don't think we have explicitly written rules for identifiers for folks building custom databases? What characters can and can't be included? For signature names, we consider everything before the...
- [x] multiple user databases
- [x] use DAMMIT_DB_DIR environmental variable

** commented out test for failing if db_dir is incorrect, as `annotate` will currently just install databases if they...
- added benchmark directive to each rule, using `benchmarks_dir` variable
- found a couple rules where logs were being put into `results_dir` -- fixed to `logs_dir`.
- removed comment about...