C. Titus Brown

Results 624 issues of C. Titus Brown

In https://github.com/sourmash-bio/sourmash/issues/2154, we've been talking about how to include taxonomic information in zipfiles, and I've been trying to figure out how that would work at the command line. But all...

taxonomy

In https://github.com/sourmash-bio/sourmash/pull/2195, we added generic support for gzipped CSVs. In most of the codebase, we use `FileInputCSV` which will auto-detect gzipped files, but this is not yet usable by the...
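
As a rough illustration of what auto-detecting a gzipped CSV can look like, here is a minimal sketch that checks for the two-byte gzip magic number before choosing how to open the file. This is not sourmash's `FileInputCSV` implementation; the helper name `open_csv_maybe_gzipped` is hypothetical and only the Python standard library is used.

```python
import csv
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def open_csv_maybe_gzipped(path):
    """Open `path` in text mode, decompressing it if it looks gzipped.

    Hypothetical helper for illustration; not part of sourmash.
    """
    # Peek at the first two bytes instead of trusting the file extension.
    with open(path, "rb") as fp:
        magic = fp.read(2)

    if magic == GZIP_MAGIC:
        return gzip.open(path, "rt", newline="", encoding="utf-8")
    return open(path, "rt", newline="", encoding="utf-8")
```

The returned handle can be passed straight to `csv.DictReader`, so callers do not need to know whether the file on disk was compressed.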

In https://github.com/sourmash-bio/sourmash/pull/2204, we added `--fail-on-empty-database/--no-fail-on-empty-database`, with the default being `--fail`. For v5, we should consider changing this to `--no-fail`.

5.0

This PR supports automatic detection/loading of taxonomies from zip files (including zip databases), when the taxonomy file is named `SOURMASH-TAXONOMY.csv`. Fixes https://github.com/sourmash-bio/sourmash/issues/2012
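
The detection step described in this PR can be sketched with the standard library alone: look for a member named `SOURMASH-TAXONOMY.csv` in the zip file and, if it is present, parse it as CSV. This is an assumption-laden illustration, not the code from the PR; the function name `load_taxonomy_rows` is hypothetical, and no particular column layout is assumed.

```python
import csv
import io
import zipfile

TAXONOMY_NAME = "SOURMASH-TAXONOMY.csv"  # member name given in the PR description


def load_taxonomy_rows(zip_path):
    """Return taxonomy rows from a zip file as a list of dicts.

    Returns None when the zip file has no taxonomy member.
    Hypothetical helper for illustration; not sourmash's implementation.
    """
    with zipfile.ZipFile(zip_path) as zf:
        if TAXONOMY_NAME not in zf.namelist():
            return None
        with zf.open(TAXONOMY_NAME) as fp:
            # zf.open() yields bytes; wrap it so the csv module sees text.
            text = io.TextIOWrapper(fp, encoding="utf-8", newline="")
            return list(csv.DictReader(text))
```

Returning `None` for a zip file without the member lets a caller fall back to an explicitly supplied taxonomy file.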

there is some interest in translating between taxonomies (GTDB, NCBI, and maybe LINS), and this is something that we should be able to do somewhat straightforwardly in sourmash. relevant issues...

taxonomy

conversation with @bluegenes - >for sourmash gather, we could (a) set scaled automatically based on the threshold-bp (whether default or specified by user). This would be a move towards “automatic”...

speeding-up-gather

from STAMPS 2022 - https://hackmd.io/vYaK2UngTWSkKmcpQMP6NA associated presentations and tutorials - [taylor on assembly and binning](https://github.com/mblstamps/stamps2022/blob/main/assembly_and_binning/20220725_stamps2022_assembly_and_binning_full.pdf) [tutorial on assembly and binning with ATLAS](https://github.com/mblstamps/stamps2022/blob/main/assembly_and_binning/tutorial_assembly_and_binning.md) [titus on assembly free analysis with k-mers](https://github.com/mblstamps/stamps2022/blob/main/kmers_and_sourmash/2022-stamps-assembly-free%20class.pdf) [titus...

doc

Over in https://github.com/sourmash-bio/sourmash/issues/2128, I wrote >>Also while I am at it, any speed/memory advantage using FrozenHashes? >Not a big one. The main goal of FrozenMinHash is to enable future optimizations...

when I first implemented the LCA stuff, I was young and foolish and thought there was a point to allowing flexibility in removing identifier versions - e.g. converting `GCF_000422665.1` into...

5.0
6.0

First, it doesn't pick up the k-mer size from the available signatures, so you have to specify it explicitly on the command line (unlike `sourmash index`, where you only need...