Tessa Pierce Ward

75 issues by Tessa Pierce Ward

`sourmash compare` outputs a similarity matrix. Many tools want the opposite, a distance matrix. It would be relatively straightforward to support `--distance` (outputs 1-value). This would work for jaccard, containment,...

enhancement
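A minimal sketch of the `1 - value` conversion described in the issue above, assuming the similarity matrix is a square list of lists with values in [0, 1]. The function name `similarity_to_distance` is hypothetical and is not part of sourmash; this only illustrates the arithmetic, not how a `--distance` option would be implemented.

```python
def similarity_to_distance(matrix):
    """Return a new matrix where each entry is 1 - similarity.

    Hypothetical helper for illustration only; not a sourmash function.
    Assumes a square matrix of similarities in the range [0, 1].
    """
    n = len(matrix)
    for row in matrix:
        if len(row) != n:
            raise ValueError("expected a square matrix")
        for value in row:
            if not 0.0 <= value <= 1.0:
                raise ValueError("similarity values must be in [0, 1]")
    # Identical items (similarity 1.0) end up at distance 0.0.
    return [[1.0 - value for value in row] for row in matrix]


similarities = [
    [1.0, 0.25],
    [0.25, 1.0],
]
distances = similarity_to_distance(similarities)
```

The same transformation applies elementwise whether or not the input is symmetric, so an asymmetric containment matrix would give an asymmetric distance matrix.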

`sourmash tax genome --force` fails silently or yields a confusing error if you pass in multiple gather results for the same query. I accidentally passed in both k7 and k10 gather...

taxonomy

I think I mentioned this as part of another issue, but I think it would make taxonomy analyses a bit simpler if we included our taxonomy csv (optionally formatted as...

taxonomy

When running `sourmash compare`, the matrix will be asymmetric. But which direction is which? And is it the same for `containment` and `containment-ani`? Based on code here: https://github.com/sourmash-bio/sourmash/blob/latest/src/sourmash/compare.py#L75-L90 For `--containment`,...

doc
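To illustrate why the direction matters, here is a small sketch of containment on plain Python sets. It assumes the usual definition, containment of A in B = |A ∩ B| / |A|, and an arbitrary convention where `matrix[i][j]` holds the containment of item `i` in item `j`. That convention is an assumption made for this example; it does not answer which direction `sourmash compare` actually uses.

```python
def containment(a, b):
    """Fraction of the elements of `a` that are also present in `b`."""
    a, b = set(a), set(b)
    if not a:
        raise ValueError("containment is undefined for an empty set")
    return len(a & b) / len(a)


small = {1, 2}
large = {1, 2, 3, 4}

small_in_large = containment(small, large)  # 1.0: all of `small` is in `large`
large_in_small = containment(large, small)  # 0.5: half of `large` is in `small`

# Assumed convention for this sketch: row i, column j = containment of i in j.
items = [small, large]
matrix = [[containment(a, b) for b in items] for a in items]
```

Because `matrix[0][1]` and `matrix[1][0]` differ, reading the matrix with the rows and columns swapped gives a different answer, which is why the documentation needs to state the orientation explicitly.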

I'm having trouble matching some of my protein sigs to the rs207 taxonomy, and I'm noticing some mismatches in `GCA` vs `GCF` identifiers. Looking at a taxonomy csv ``` grep...

I don't think we have explicitly written rules for identifiers for folks building custom databases? What characters can and can't be included? For signature names, we consider everything before the...

doc
taxonomy

- [x] multiple user databases
- [x] use DAMMIT_DB_DIR environmental variable
  ** commented out test for failing if db_dir is incorrect, as `annotate` will currently just install databases if they...

- added benchmark directive to each rule, using `benchmarks_dir` variable
- found a couple rules where logs were being put into `results_dir` -- fixed to `logs_dir`.
- removed comment about...