C. Titus Brown

979 comments by C. Titus Brown

> Also, the column name is actually `potential_false_negative`, is it supposed to be `potential_false_positive`?

hmm, I ... don't know. Are we doing double negatives here? 😭 @bluegenes your thoughts welcome!

hi @Amanda-Biocortex, the calculation is published here: "Deriving confidence intervals for mutation rates across a wide range of evolutionary distances using FracMinHash", https://genome.cshlp.org/content/33/7/1061 (preprint here: https://www.biorxiv.org/content/10.1101/2022.01.11.475870v4). My recollection is that...

> I believe 95% ANI threshold is standard- would this be the same for Sourmash ANI?

95% is usually used for species cutoffs between two genomes. sourmash's containment ANI is...
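
To make the threshold question concrete, here is a minimal sketch of how a containment value maps to an ANI point estimate. It assumes the simple mutation model from the FracMinHash paper cited above, under which containment is roughly `(1 - p)^k` for mutation rate `p`, so ANI is roughly `containment ** (1 / k)`. The function names and the 0.95 default are illustrative only and are not sourmash's API; sourmash's own calculation also reports confidence intervals and may differ in detail.

```python
def containment_to_ani(containment: float, ksize: int) -> float:
    """Point estimate of ANI from k-mer containment.

    Assumption: each base mutates independently, so a k-mer survives with
    probability (1 - p)**ksize and ANI = 1 - p = containment ** (1 / ksize).
    """
    if not 0.0 <= containment <= 1.0:
        raise ValueError("containment must be in [0, 1]")
    if ksize <= 0:
        raise ValueError("ksize must be positive")
    return containment ** (1.0 / ksize)


def passes_species_cutoff(containment: float, ksize: int,
                          threshold: float = 0.95) -> bool:
    """Hypothetical helper: compare the ANI estimate to a cutoff."""
    return containment_to_ani(containment, ksize) >= threshold


if __name__ == "__main__":
    # At k=31, a containment of 0.5 still corresponds to a high ANI estimate.
    print(containment_to_ani(0.5, 31))
    print(passes_species_cutoff(0.5, 31))
```

Because of the `1 / k` exponent, a modest-looking containment can still sit above a 95% ANI cutoff, so the cutoff should be applied to the ANI estimate rather than to raw containment.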

I'll look into it! It should be reporting both matches. Thank you for providing so much info!

Also found in #3284 by @agombolay! This is caused by the following code, which intentionally removes duplicate sketches from consideration: https://github.com/sourmash-bio/sourmash/blob/bc2297050b6db10b144916700b4550ec50b26a8f/src/sourmash/search.py#L685-L691

As I wrote in #3284,

> ... `search` is...
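
For readers who have not opened the link, the effect of this kind of deduplication can be sketched as follows. This is not the linked sourmash code: it is a standalone illustration that assumes duplicates are detected by an identical content hash (here a hypothetical `md5` string per match), and the function name is made up.

```python
from typing import Iterable, List, Tuple

Match = Tuple[str, str]  # (match name, content hash); hypothetical shape


def drop_duplicate_matches(matches: Iterable[Match]) -> List[Match]:
    """Keep only the first match seen for each content hash."""
    seen = set()
    unique = []
    for name, md5 in matches:
        if md5 in seen:
            # An identical sketch was already reported, so skip this one.
            continue
        seen.add(md5)
        unique.append((name, md5))
    return unique


if __name__ == "__main__":
    hits = [("genomeA", "abc123"), ("genomeA_copy", "abc123"),
            ("genomeB", "def456")]
    print(drop_duplicate_matches(hits))
```

With this behavior, two database entries holding the same sketch produce a single reported match, which is why only one of the two expected matches shows up.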

ref https://github.com/sourmash-bio/sourmash/issues/3022

Will be fixed by https://github.com/sourmash-bio/sourmash_plugin_branchwater/pull/658

hi @zilov, apologies for the long delay in responding. The size of the full k=21/31/51, scaled=1000 databases is 206 GB. Here's the directory listing:

```
-r--r--r-- 1 ctbrown datalabgrp 48864550095...
```

Another workflow dealing with the k=31/k=51 issue, for properly calculating `f_unique_weighted`: https://github.com/ctb/2025-sourmash-euk-gtdb-tax