
Schmutzi damage estimation for NEBNext® Ultra™ II DNA library kit

Open. freshspaceoctopus opened this issue 1 year ago • 1 comment

Hello, I am planning to estimate contamination for libraries prepared with the non-UDG NEBNext® Ultra™ II DNA library kit.

Non-UDG NEBNext® Ultra™ II DNA libraries have a different damage pattern: the 5' C->T deamination signal is removed, which results in a different damage plot.

As a result, estimating the contamination prior is a problem: the program fails to find the endogenous contamination estimates.

As a workaround, I calculated the overall deamination probabilities using your bam2prof without the endo option, then provided the resulting file as a prior.

When we ran a naive simulation, this approach often underestimated contamination (with 20% true contamination, the estimates were about 10%).

My understanding was that, given a sensible prior, the program would work fine, but maybe more work is needed before schmutzi can be applied to non-UDG NEBNext® Ultra™ II DNA data.

Do you have any suggestions for contamination analysis?

Thank you in advance.

freshspaceoctopus, Oct 15 '24 12:10

Thank you for the question! It is possible to bypass the initial estimate and supply schmutzi with an estimate directly. However, be careful: in that case there is no way of teasing apart the contaminant and the endogenous.

grenaud, Oct 15 '24 13:10