
Bismark silently outputs incorrect results when UMIs are added using Illumina's bcl-convert

Open lars-work-sund opened this issue 5 months ago • 7 comments

This is not really a bug, since the documentation clearly states how deduplicate_bismark expects UMIs to be handled, but it is an easy mistake to make. As documented in deduplicate_bismark, Bismark expects UMIs of the form:

@A00001:001:HN2F7DRX1:1:1101:1452:1000 1:N:0:AATGACGC:CAAGAG

But if Illumina's bcl-convert is used with OverrideCycles to handle UMIs, the read ID looks like this:

@A00001:001:HN2F7DRX1:1:1101:1452:1000:CAAGAG 1:N:0:AATGACGC

In both examples the UMI is CAAGAG. In the second layout the sample index (AATGACGC) is the last element of the read ID, so it is used as the UMI, and no warning or error is emitted.

I propose running a pre-flight check to detect this scenario, and potentially to support the UMI location chosen by Illumina.
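To make the proposal concrete, here is a rough sketch of what such a pre-flight check could look like. It is not part of Bismark; `classify_umi_location` is a hypothetical helper, and it assumes headers follow the two layouts shown above (a 7-field Illumina read name, optionally with the UMI appended as an 8th field, and a 4-field comment, optionally with the UMI appended as a 5th field).

```python
import re

# Hypothetical helper, not part of Bismark. Assumes UMIs consist of
# A/C/G/T/N, optionally two parts joined by '+' for paired UMIs.
UMI_RE = re.compile(r"^[ACGTN]+(\+[ACGTN]+)?$")


def classify_umi_location(header: str) -> str:
    """Guess where the UMI sits in an Illumina-style FASTQ header.

    Returns "comment_end" (the layout shown first above),
    "read_name" (the bcl-convert layout), or "none".
    """
    # Split "@<read name> <comment>" into its two parts.
    name, _, comment = header.strip().lstrip("@").partition(" ")
    name_fields = name.split(":")
    comment_fields = comment.split(":") if comment else []

    # A standard read name has 7 colon-separated fields;
    # an 8th sequence-like field suggests a UMI appended to the name.
    if len(name_fields) == 8 and UMI_RE.match(name_fields[-1]):
        return "read_name"

    # The comment normally has 4 fields (read:filtered:control:index);
    # a 5th sequence-like field suggests a UMI appended after the index.
    if len(comment_fields) == 5 and UMI_RE.match(comment_fields[-1]):
        return "comment_end"

    return "none"
```

A check like this could be run on the first few reads of a file, emitting a warning when the result is "read_name" or "none" while UMI-based deduplication is requested.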

EDIT: I might have been completely off. I'll close it for now.

lars-work-sund, Sep 11 '24 08:09