C. Titus Brown
while I'm thinking about it... it would be nice to move towards genuinely interactive search, gather, and MAGsearch. greyhound #1226 and greyhound.sourmash.bio/ are super cool, of course, but it's "only" searching...
from [benchmarking on 1.2m signature genbank](https://github.com/dib-lab/2020-paper-sourmash-gather/issues/47), even fairly simple environmental metagenomes such as SRR1976948 are matching to lots of redundancy. we're looking at ways to deal with this at a...
Luiz's talk on August 30th, 2022 at the annual JGI User Meeting: https://luizirber.org/talks/2022-08-30-JGI/slides.html
The root question that led to discovering https://github.com/sourmash-bio/sourmash/issues/2318 was that @bluegenes was getting unexpectedly inaccurate results from a zymo mock community analysis, and she wanted to understand why. In brief,...
when @bluegenes was digging into some classification results, she discovered that `gather` was not outputting all of the prefetch results (as evaluated by comparing to `sourmash prefetch -o ...`). (IMO...
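A comparison like this can be scripted. Below is a minimal sketch that lists identifiers present in a prefetch CSV but absent from a gather CSV. The function name `missing_from_gather` is hypothetical, and the default column names (`match_md5` for prefetch, `md5` for gather) are assumptions about the CSV headers, so check them against your own files and override them if they differ.

```python
import csv
import io


def missing_from_gather(prefetch_csv, gather_csv,
                        prefetch_col="match_md5", gather_col="md5"):
    """Return prefetch identifiers that do not appear in the gather output.

    Both arguments are open text file objects. The column names are
    assumptions; pass different ones if your CSV headers differ.
    """
    prefetch_ids = [row[prefetch_col] for row in csv.DictReader(prefetch_csv)]
    gather_ids = {row[gather_col] for row in csv.DictReader(gather_csv)}
    # Preserve prefetch order so the report is easy to line up with the CSV.
    return [ident for ident in prefetch_ids if ident not in gather_ids]


# Tiny in-memory example with made-up identifiers.
prefetch_text = "match_md5,match_name\naaa,genome A\nbbb,genome B\nccc,genome C\n"
gather_text = "md5,name\naaa,genome A\nccc,genome C\n"

missing = missing_from_gather(io.StringIO(prefetch_text),
                              io.StringIO(gather_text))
print(missing)
```

With real output, open the two CSV files with `open(..., newline="")` and pass the file objects in place of the `io.StringIO` wrappers.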
from microbial bioinfo slack - >I downloaded the 661K_sourmash_index_scaled.sbt.zip from http://ftp.ebi.ac.uk/pub/databases/ENA2018-bacteria-661k/ and when running the search it says: ERROR: cannot use '661K_sourmash_index_scaled.sbt.zip' for this query. cannot search this SBT for...
in https://github.com/sourmash-bio/sra_search/pull/12, @luizirber defines some functions that seem to me like they would be nice to have in sourmash core -
* `fn check_compatible_downsample` - check if two `KmerMinHash` are...
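
To illustrate the general idea of a downsampling compatibility check, here is a minimal Python sketch. It is not taken from the pull request. `ToySketch`, `compatible_for_downsample`, `downsample`, and `to_common_scaled` are all hypothetical names, and the compatibility rule (same k-mer size and molecule type) is an assumption made for illustration. The toy model keeps hashes at or below `2**64 // scaled`, which only approximates how scaled sketches are thresholded.

```python
from dataclasses import dataclass

MAX_HASH = 2**64  # size of the 64-bit hash space


@dataclass(frozen=True)
class ToySketch:
    """Hypothetical stand-in for a scaled MinHash sketch (not the sourmash type)."""
    ksize: int
    moltype: str
    scaled: int
    hashes: frozenset


def compatible_for_downsample(a: ToySketch, b: ToySketch) -> bool:
    # Assumed rule: sketches can be compared after downsampling only if
    # they share the same k-mer size and molecule type.
    return a.ksize == b.ksize and a.moltype == b.moltype


def downsample(sketch: ToySketch, scaled: int) -> ToySketch:
    # Downsampling can only discard hashes, so the target scaled value
    # must be at least as large as the current one.
    if scaled < sketch.scaled:
        raise ValueError("cannot downsample to a smaller scaled value")
    threshold = MAX_HASH // scaled
    kept = frozenset(h for h in sketch.hashes if h <= threshold)
    return ToySketch(sketch.ksize, sketch.moltype, scaled, kept)


def to_common_scaled(a: ToySketch, b: ToySketch):
    """Bring both sketches to the larger of their two scaled values."""
    if not compatible_for_downsample(a, b):
        raise ValueError("sketches differ in ksize or moltype")
    target = max(a.scaled, b.scaled)
    return downsample(a, target), downsample(b, target)


a = ToySketch(31, "DNA", 2, frozenset({10, 2**61, 2**62 + 7}))
b = ToySketch(31, "DNA", 4, frozenset({10, 2**60}))
a4, b4 = to_common_scaled(a, b)
print(sorted(a4.hashes), a4.scaled == b4.scaled)
```

Running the check before downsampling lets a search loop skip incompatible sketches up front without attempting the conversion.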
https://github.com/dosisod/refurb

Luiz says:
>Seems like a good pairing for https://github.com/asottile/pyupgrade

could also take this as an opportunity to run flake8 across the codebase.
a lot of our ancillary documentation and workflows (genome-grist and spacegraphcats) suggest doing k-mer trimming of metagenomes, e.g. with trim-low-abund. but, for gather in particular, this is not particularly important....
In https://github.com/sourmash-bio/sourmash/pull/2249, we adjusted the gather output and the `kreport` format to provide abundance-weighted results. If you run a gather _without_ abundance, ``` % sourmash gather test1.sig ../../../gtdb-rs207.genomic.k31.zip --picklist test1.gather.csv::gather...