Fix off-by-one error finding 3' cutoff.

Open MikkelSchubert opened this issue 7 years ago • 0 comments

This patch fixes an off-by-one error in the 'sliding_window' function. The function checks whether window_start + window_size > fqrec->qual.l to decide whether the current window is the last one, in which case it should refine the 3' cutoff. However, the loop only runs while i + window_size <= fqrec->qual.l, where i == window_start, so that check can never succeed. As a result, trailing low-quality bases are not trimmed when the average quality does not fall below the minimum before the last window.
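The contradiction between the loop bound and the last-window check can be reproduced in isolation. The following is a minimal sketch, not the code from sickle and not the attached patch: `last_window_hits` is a hypothetical helper that counts how many iterations pass the last-window test, and the `>=` comparison is only one plausible way to make the test reachable.

```c
#include <stddef.h>

/* Hypothetical helper for illustration only (not part of sickle).
 * Runs the same kind of loop as described above and counts how many
 * iterations satisfy the "is this the last window?" test.
 *   inclusive == 0: strict test, i + window_size >  qual_len
 *   inclusive != 0: assumed fix,  i + window_size >= qual_len */
size_t last_window_hits(size_t qual_len, size_t window_size, int inclusive)
{
    size_t hits = 0;

    /* Loop bound as described: the window must fit inside the read. */
    for (size_t i = 0; i + window_size <= qual_len; i++) {
        int is_last = inclusive ? (i + window_size >= qual_len)
                                : (i + window_size > qual_len);
        if (is_last) {
            hits++;
        }
    }
    return hits;
}
```

With the strict comparison the function returns 0 for any input, because the loop condition already guarantees i + window_size <= qual_len. With the inclusive comparison it returns 1 whenever at least one window fits in the read, which is the iteration where the refinement of the 3' cutoff would be expected to run.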

The problem can be demonstrated with the following FASTQ read:

$ cat example.fq
@foo
AAAAAAAAAAAAAAAAAAAAAAAAAAAAA
+
IIIIIIIIIIIIIIIIIIIIIIIIIIII!

This is the output when processed with the current head (d802a80f89c02c93d112151bc8426d029ef16f7e):

$ ./sickle_old se -f example.fq -t sanger -o /dev/stdout
@foo
AAAAAAAAAAAAAAAAAAAAAAAAAAAAA
+
IIIIIIIIIIIIIIIIIIIIIIIIIIII!
...

With the attached patch, the final refinement is carried out at the last window:

$ ./sickle_new se -f example.fq -t sanger -o /dev/stdout
@foo
AAAAAAAAAAAAAAAAAAAAAAAAAAAA
+
IIIIIIIIIIIIIIIIIIIIIIIIIIII
...

MikkelSchubert, May 08 '17 19:05