Bug in algorithm ReportLongMatches?
Algorithm 3 (`pbwt -longWithin 5`) does not report some long matches. For example, with the following input:
1:1010001 001100
4:1010001 001100
0:0110000 101010
2:0011000 110010
5:0011001 000010
3:1011001 100100
I would expect a match between haplotypes 5 and 3 to be reported at k=6. Or is my interpretation of a "long match" incorrect?
EDIT: or perhaps 5 and 2 at k=6 and 5 and 3 at k=7, depending on whether a match is reported at its last matching position or at the position immediately after it.
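
To make the expected output concrete, here is a minimal brute-force sketch that lists every maximal run of agreement between each pair of haplotypes in the example. It is not pbwt's algorithm, only an O(M²N) reference to compare against. The names `HAPS` and `brute_force_long_matches` are made up for this illustration, and it assumes that "long" means a length of at least `min_len` sites, that positions are 0-based, and that `end` is the first position after the match.

```python
from itertools import combinations

# Haplotypes copied from the example above, keyed by haplotype index.
HAPS = {
    0: "0110000101010",
    1: "1010001001100",
    2: "0011000110010",
    3: "1011001100100",
    4: "1010001001100",
    5: "0011001000010",
}


def brute_force_long_matches(haps, min_len):
    """Return sorted (i, j, start, end) tuples for every maximal run of
    agreement of at least min_len sites between haplotypes i < j.

    Assumptions (not taken from pbwt itself): positions are 0-based,
    end is exclusive, and all haplotypes have the same length.
    """
    matches = []
    for i, j in combinations(sorted(haps), 2):
        a, b = haps[i], haps[j]
        start = 0  # first site of the current run of agreement
        for k in range(len(a) + 1):
            # A run ends at a mismatch or at the end of the sequence.
            if k == len(a) or a[k] != b[k]:
                if k - start >= min_len:
                    matches.append((i, j, start, k))
                start = k + 1
    return sorted(matches)


if __name__ == "__main__":
    for i, j, start, end in brute_force_long_matches(HAPS, 5):
        print(f"{i} {j} sites [{start}, {end}) length {end - start}")
```

Under these assumptions, the result for `min_len=5` includes haplotypes 2 and 5 matching over sites 0 to 5 (so `end` is 6) and haplotypes 3 and 5 matching over sites 1 to 6 (so `end` is 7), both of length 6.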
I have observed a similar issue: some matches appear not to be reported. It seems that matchLongWithin2 is not capturing all of the matches. Is there an alternative implementation that addresses this?