kevlar
Re-evaluate impact of error correction
Performing error correction drastically reduces the sequence content (specifically the number of distinct k-mers) in each data set, and accordingly the amount of memory required to track k-mer counts accurately. We were pretty enthusiastic about this improvement at one point, but later abandoned it because it led to some false negatives.
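To illustrate why error correction shrinks the distinct k-mer count, here is a minimal toy sketch. It is not kevlar's counting code: the `distinct_kmers` helper, the toy sequence, and the choice of k=5 are all assumptions made for illustration. It only shows that a single substitution error can introduce up to k novel k-mers that disappear once the read is corrected.

```python
def distinct_kmers(reads, k):
    """Return the set of distinct k-mers observed across all reads."""
    kmers = set()
    for read in reads:
        for i in range(len(read) - k + 1):
            kmers.add(read[i:i + k])
    return kmers


k = 5  # toy value, far smaller than what is used on real data

# Hypothetical 20 bp "true" sequence, sampled three times.
truth = "ACGTTGCAAGGCTTACCGTA"

# One copy carries a single substitution error at index 10 (G -> T).
erroneous = truth[:10] + "T" + truth[11:]

raw_reads = [truth, truth, erroneous]
corrected_reads = [truth, truth, truth]  # idealized, perfect error correction

raw = distinct_kmers(raw_reads, k)
corrected = distinct_kmers(corrected_reads, k)

# The erroneous read contributes k novel k-mers, one per window that
# overlaps the substituted base.
print(len(raw), len(corrected))
```

In this toy case the error-free reads yield 16 distinct 5-mers, while the single error adds 5 more. On real data with many errors, these spurious k-mers account for a large share of what the counting structure has to hold.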
I think this decision was based on a small number of manually inspected variants (perhaps even just one), and not on overall statistics. In any case, all of the variants involved were SNVs, where our superiority is already marginal. We should re-investigate kevlar's performance on the latest simulations using error-corrected data.