PureCN
A high duplication rate for WGS
Hi, I ran WGS on cell lines. When I use wgs_calling_regions.hg38.bed as the input for making the interval file and set the off-target regions to 130000 for the coverage calculation, I get the values below. Do you think I should be worried about the duplication rate?
- mean.coverage.ontarget: This value represents the average coverage of the target regions. In this case, it is approximately 8.77, meaning on average, each base in the target regions was sequenced 8.77 times.
- mean.coverage.offtarget: This value represents the average coverage of the off-target regions, which is about 0.16. This is significantly lower than the on-target coverage, as expected in targeted sequencing where the focus is primarily on specific regions of interest.
- mean.duplication.ontarget: This indicates the mean duplication rate of the on-target regions, approximately 0.99. A high duplication rate close to 1.0 suggests that almost all on-target reads are duplicates, which can happen if the library complexity is low or if there is over-amplification during library preparation.
- mean.duplication.offtarget: Similar to the on-target duplication rate, this value represents the mean duplication rate of the off-target regions, also approximately 0.99.
- mom.raw.ontarget: This stands for “median of means” of the raw coverage data for the on-target regions. The value 0.991334418167518 suggests high uniformity in coverage across the target regions before any normalization.
- mom.raw.offtarget: This is the “median of means” of the raw coverage data for the off-target regions. It is similar to the on-target raw data value, indicating consistency across off-target areas.
- mom.post.gc.ontarget: This is the “median of means” after GC bias correction for on-target regions. The correction attempts to account for GC-content bias in sequencing. The value 0.991334418167518 suggests minimal deviation from uniformity post-GC correction.
- mom.post.gc.offtarget: This value, similar to the on-target GC correction, applies to the off-target regions. The value shows how the coverage looks after adjusting for GC content across non-targeted regions.
- mom.post.reptiming.ontarget: After considering replication timing (the phase of DNA replication when particular sequences are duplicated), this metric shows the coverage median of means for the on-target areas, remaining unchanged in your case.
- mom.post.reptiming.offtarget: Like the on-target replication timing metric, this value is for off-target regions. It remains unchanged, indicating consistent coverage and bias corrections.
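To make the numbers above easier to compare, here is a minimal Python sketch that computes the on-target to off-target coverage ratio and flags the duplication metrics. The function name `summarize_qc` and the 0.5 cutoff are my own assumptions for illustration; they are not PureCN defaults or recommendations.

```python
# Values copied from the QC output listed above.
metrics = {
    "mean.coverage.ontarget": 8.77,
    "mean.coverage.offtarget": 0.16,
    "mean.duplication.ontarget": 0.99,
    "mean.duplication.offtarget": 0.99,
}


def summarize_qc(metrics, max_duplication=0.5):
    """Summarize coverage and duplication metrics.

    max_duplication is an arbitrary, hypothetical cutoff chosen only
    for illustration; it does not come from PureCN.
    """
    # Ratio of mean on-target to mean off-target coverage.
    ratio = metrics["mean.coverage.ontarget"] / metrics["mean.coverage.offtarget"]
    # Names of duplication metrics that exceed the cutoff.
    high_dup = [
        name
        for name in ("mean.duplication.ontarget", "mean.duplication.offtarget")
        if metrics[name] > max_duplication
    ]
    return {"on_off_coverage_ratio": ratio, "high_duplication": high_dup}


summary = summarize_qc(metrics)
print(summary)
```

With these values, both duplication metrics exceed the assumed cutoff, and the on-target coverage is roughly 55 times the off-target coverage.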
Thanks for any ideas.