strobealign

Lower bounding w_min to 1 instead of 0?

Open ksahlin opened this issue 2 months ago • 0 comments

For logging and coming back to later:

Currently, we have randstrobe(l, u, q, max_dist, std::max(0, k / (k - s + 1) + l), k / (k - s + 1) + u).
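To make the clamp concrete, here is a minimal sketch of how the two window arguments come out for a given lower bound. `WindowBounds` and `window_bounds` are hypothetical names for illustration only, not strobealign's actual API; the only part taken from the call above is the expression `k / (k - s + 1)` (integer division) plus `l` or `u`.

```cpp
#include <algorithm>

// Hypothetical helper for illustration, not part of strobealign.
struct WindowBounds {
    int w_min;
    int w_max;
};

// `lower_bound` is the clamp under discussion: 0 as in the current call,
// 1 as proposed in the issue title.
inline WindowBounds window_bounds(int k, int s, int l, int u, int lower_bound) {
    // Integer division, as in the expression k / (k - s + 1).
    const int base = k / (k - s + 1);
    return WindowBounds{std::max(lower_bound, base + l), base + u};
}
```

With illustrative values k = 20, s = 16, l = -5, u = 2 (chosen only to trigger the clamp), `base` is 4 and `base + l` is -1, so `window_bounds(20, 16, -5, 2, 0).w_min` is 0 while `window_bounds(20, 16, -5, 2, 1).w_min` is 1.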

This is primarily for our parameter optimization, but if the parameters are chosen such that k / (k - s + 1) + l <= 0, then w_min is set to 0. Because of our link function std::bitset<64> b = (strobe1.hash ^ syncmers[i].hash) & q;, w_min = 0 will deterministically pick strobe1 = strobe2 (since strobe1.hash ^ strobe1.hash is 0) and thus effectively emulate k-mers. For such a setting and some read lengths, I observed a significant drop in map rate (over 5%), increased runtime, and often, but not always, a decrease in accuracy. This may have been the problem with our initial parameter optimization?
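A small sketch of why this happens. It assumes the second strobe is the syncmer in the window that minimizes the popcount of `b`, with the first minimum winning ties; that selection rule, the `Syncmer` struct, and `pick_second_strobe` are assumptions for illustration and not copied from the strobealign source. Only the link expression itself is taken from the text above.

```cpp
#include <bitset>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a syncmer; only the hash matters here.
struct Syncmer {
    uint64_t hash;
};

// Returns the index of the second strobe chosen for strobe1 = syncmers[i],
// searching offsets w_min..w_max (clipped to the end of the vector).
// Assumption: the candidate with the smallest popcount wins, first one on ties.
inline std::size_t pick_second_strobe(const std::vector<Syncmer>& syncmers,
                                      std::size_t i,
                                      std::size_t w_min,
                                      std::size_t w_max,
                                      uint64_t q) {
    const Syncmer& strobe1 = syncmers[i];
    std::size_t best = i;       // falls back to i if the window is empty
    std::size_t min_val = 65;   // larger than any 64-bit popcount
    for (std::size_t j = i + w_min; j <= i + w_max && j < syncmers.size(); ++j) {
        // Link function from the issue text.
        std::bitset<64> b = (strobe1.hash ^ syncmers[j].hash) & q;
        const std::size_t res = b.count();
        if (res < min_val) {
            min_val = res;
            best = j;
        }
    }
    return best;
}
```

With w_min = 0 the first candidate is strobe1 itself, `b` is all zeros, and no later candidate can score strictly lower, so the function returns `i`. With w_min = 1 the window starts one syncmer downstream and strobe1 can no longer be its own partner.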

With mcs, picking k-mers (w_min = 0) should be strictly worse than any other parameter setting, since we get the k-mers for free.

ksahlin, Apr 18 '24 14:04