
Quickly search, compare, and analyze genomic and metagenomic data sets.

Results: 290 sourmash issues

From https://github.com/sourmash-bio/sourmash/pull/2178, @bluegenes: > More complicated use case that would be _really_ neat to enable: run prefetch against, e.g. genus-level representative database. Then run gather and use the prefetch output...

I don't think we have explicitly written rules for identifiers for folks building custom databases? What characters can and can't be included? For signature names, we consider everything before the...

doc
taxonomy

when we use `sourmash tax annotate` on gather results, we produce a column with semicolon-separated lineages in it. we don't have many (any?) sourmash subcommands that natively ingest that format,...

taxonomy
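As an illustration of what ingesting that semicolon-separated lineage column could involve, here is a minimal sketch in plain Python. It does not use any sourmash API; the `parse_lineage` helper and the `RANKS` tuple are hypothetical names, and the rank order shown is an assumption rather than something the issue specifies.

```python
# Hypothetical rank order, assumed for illustration only.
RANKS = ("superkingdom", "phylum", "class", "order", "family", "genus", "species")


def parse_lineage(lineage, ranks=RANKS):
    """Split a semicolon-separated lineage string into (rank, name) pairs.

    Lineages shorter than `ranks` are allowed (e.g. genus-level results);
    longer ones are rejected because there is no rank to pair them with.
    """
    names = [part.strip() for part in lineage.split(";") if part.strip()]
    if len(names) > len(ranks):
        raise ValueError(f"lineage has {len(names)} entries but only {len(ranks)} ranks")
    return list(zip(ranks, names))


# Made-up example value in the semicolon-separated style described above.
example = "d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria"
pairs = parse_lineage(example)
for rank, name in pairs:
    print(f"{rank}\t{name}")
```

Pairing each name with its rank up front keeps downstream code from having to count semicolons to decide what level a name belongs to.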

right now, the `fromfile` format doesn't support a simple way to produce translated sequence - presumably we'd need to add a CDS column or something, or else build workflows (elsewhere)...

Adds a `FrozenSourmashSignature` class, and provides sensible `to_mutable()` and `to_frozen()` methods on `SourmashSignature` and `FrozenSourmashSignature`. Provides an `update()` context manager that wraps changes so that a `FrozenSourmashSignature` is left at...
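The general frozen/mutable pairing described here can be sketched with standard-library dataclasses. This is a toy model under stated assumptions, not the PR's implementation: `FrozenRecord` and `MutableRecord` are hypothetical stand-ins, only the `to_mutable()` and `to_frozen()` method names come from the description, and the `update()` context manager is left out because its behavior is not fully given in the excerpt.

```python
from dataclasses import dataclass, field


@dataclass
class MutableRecord:
    """Hypothetical mutable stand-in; not the real SourmashSignature."""
    name: str
    hashes: set = field(default_factory=set)

    def to_mutable(self):
        # Already mutable, so return self unchanged.
        return self

    def to_frozen(self):
        # Copy the hashes into an immutable container.
        return FrozenRecord(self.name, frozenset(self.hashes))


@dataclass(frozen=True)
class FrozenRecord:
    """Hypothetical immutable stand-in; attribute assignment raises."""
    name: str
    hashes: frozenset = frozenset()

    def to_frozen(self):
        # Already frozen, so return self unchanged.
        return self

    def to_mutable(self):
        # Hand back an independent copy so the frozen object is never altered.
        return MutableRecord(self.name, set(self.hashes))


frozen = FrozenRecord("example", frozenset({1, 2, 3}))
draft = frozen.to_mutable()
draft.hashes.add(4)
refrozen = draft.to_frozen()
```

In this sketch the original `frozen` object keeps its three hashes while `refrozen` carries the edited set, which is the property a frozen type is meant to guarantee.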

`python -m sourmash.sig` works; `python -m sourmash.tax` doesn't work; `python -m sourmash.lca` doesn't work, for different reasons. Do we want this to work? If so, we should fix and test....

#1045 defaults compute to use the B-Tree impl. Also add a flag in the CLI to choose the Vec one? The Vec one is better in very limited cases (very...

rust

After https://github.com/sourmash-bio/sourmash/pull/1610, it seems like an obvious next set of simplifications is to remove all of the `with x.update(): ...` code blocks and replace them with `flatten` and/or `downsample` calls....

see https://octo-repo-visualization.vercel.app/?repo=sourmash-bio%2Fsourmash (https://next.github.com/projects/repo-visualization?utm_source=programmingdigest&utm_medium=email&utm_campaign=432 explains how to add this to GitHub Actions).

I'd like to compute and index MinHash sketches on GTDB r202 representative genomes. The sketching step (v4.2.1) is parallelized with 16 or 40 threads on a 160-core machine. But some...