C. Titus Brown

Results 518 issues of C. Titus Brown

on farm at `/home/ctbrown/scratch/2022-branchwater-benchmarking/wort-list-d.d/Snakefile` - ```python # convert a bunch of .sig files into .sig.zip files and also produce .mf.csv files. FILELIST='../data/wort-list-d.txt' siglist = [ x.strip() for x in...

code

`__version__` (as a string) is more standard; it is mentioned in PEP 8 and PEP 440. See https://stackoverflow.com/a/459185 for links/discussion.

**Note: not yet final**
v0.2: https://github.com/AllTheBacteria/AllTheBacteria/releases/tag/v0.2
sourmash databases constructed by @ccbaumler 🎉
* [allthebacteria-v0.2-k21.zip](https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db/allthebacteria-v0.2/allthebacteria-r0.2-k21.zip) - 62 GB
* [allthebacteria-v0.2-k31.zip](https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db/allthebacteria-v0.2/allthebacteria-r0.2-k31.zip) - 62 GB
* [allthebacteria-v0.2-k51.zip](https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db/allthebacteria-v0.2/allthebacteria-r0.2-k51.zip) - 62 GB
**Note:** The databases...

more info to go here @ccbaumler :)

I'm trying to sketch the [RVDB, the Reference Viral Genome Database](https://rvdb.dbi.udel.edu/). The clustered file is ~600 MB.
```
sourmash scripts manysketch C-RVDBvCurrent.manysketch.csv -o C-RVDBvCurrent.manysketch.zip -p dna,k=21,scaled=1000 --singleton
```
took about...

Fixes `sig kmers` to properly ignore bad DNA unless `--check-sequence` is provided; adds tests. Fixes https://github.com/sourmash-bio/sourmash/issues/2842.

Over in oxli-bio/oxli, @adamtaranto and I have been implementing k-mer functionality (no sketching!) on top of `SeqToHashes` in Rust, and then exposing it to Python via pyo3. Most recently, we...

rust

Currently, `RevIndex` only supports a single `Collection` for use as external storage. This limits it to things like Zip files and .sig.gz files, and maybe manifests and pathlists of .sig.gz...

rust

@dkoslicki asks on Matrix:
> quick question: is sourmash sketch dna now multithreaded in 4.8.11? My understanding was that it was single threaded, but I just noticed via top and time...

[extract-max-extent-around-hashes.py](https://github.com/ctb/2022-sourmash-sens-spec/blob/main/scripts/extract-max-extent-around-hashes.py) is a potentially useful script that does the following: Given a query sig, and a target set of contigs, extract the maximum extent of actual DNA sequence from the...

code