C. Titus Brown
* on farm head node, not on my laptop
* easy to replicate - just follow release procedure install 😆
* no CPU usage, just hangs

happens just after this:...
these are the latest versions of pytest and pytest-xdist.
no longer happening & not reproducible. 🤷
here are some pretty pictures:
ref https://github.com/sourmash-bio/sourmash/issues/348
adding the `upset` command via the betterplot plugin in https://github.com/sourmash-bio/sourmash_plugin_betterplot/pull/35 - it produces figures like this: 
is this the same or a related issue? https://github.com/sourmash-bio/sourmash/issues/3398
my intuition suggests that, as long as the hashes are entirely independent (i.e. come from non-overlapping k-mers), this should be very reliable. @dkoslicki, we'll pass you the long-read paper results...
I put together some notebooks on detection _sensitivity_ [for 100bp reads](https://github.com/ctb/2022-sourmash-sens-spec/blob/main/detection.ipynb) and [for 10kb reads](https://github.com/ctb/2022-sourmash-sens-spec/blob/main/detection-10kb-reads.ipynb). As expected, genomes can be detected very _sensitively_ at _very, very_ low coverage. I think...
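To make the low-coverage intuition concrete, here is a rough back-of-envelope sketch. It is not taken from the notebooks above. It assumes error-free reads, uniform coverage, a Poisson model for read placement, and fully independent hashes; the function names, the 5 Mbp genome size, and the defaults `ksize=31` and `scaled=1000` are illustrative choices only.

```python
import math


def p_kmer_covered(coverage, read_len, ksize):
    """Chance that a given k-mer lies entirely inside at least one read.

    Assumes Poisson-distributed read starts and error-free reads.
    """
    # Only reads starting in a window of (read_len - ksize + 1) positions
    # contain the whole k-mer, so the effective k-mer coverage is reduced.
    lam = coverage * (read_len - ksize + 1) / read_len
    return 1.0 - math.exp(-lam)


def p_detect(genome_size, coverage, read_len, ksize=31, scaled=1000, min_hashes=1):
    """Chance that at least `min_hashes` sketch hashes of a genome are observed.

    Assumes roughly genome_size / scaled hashes in the sketch and treats
    them as independent (hypothetical simplification).
    """
    n = genome_size // scaled
    p = p_kmer_covered(coverage, read_len, ksize)
    # Binomial probability of seeing fewer than min_hashes hashes.
    below = sum(
        math.comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(min_hashes)
    )
    return 1.0 - below


if __name__ == "__main__":
    for cov in (0.0001, 0.001, 0.01):
        print(cov, round(p_detect(5_000_000, cov, read_len=100), 4))
```

Under these assumptions a single-hash match becomes likely even when coverage is far below 1x. The independence assumption is shakier for 10kb reads, where one read can span several sketch hashes, so treat the long-read numbers from this model as a loose guide only.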
long read paper: [Evaluation of taxonomic classification and profiling methods for long-read shotgun metagenomic sequencing datasets](https://www.biorxiv.org/content/10.1101/2022.01.31.478527v2). Relevant: Figure 4, showing that sourmash has weirdly good performance 😆