mixcr
PCR and Sequencing error correction improvement
- Match quality and probability for more rigorous filtering of sequencing errors:
  a. Use the `MaxQuality` strategy for the overlap merger
  b. Use `SumQuality` for clone accumulators
  c. Filter out clones with a high error probability, p = 1 - Prod_i(1 - 10^(-q_i/10))
- Strategy for more rigorous PCR error correction:
  a. Introduce a lower bound (`lowerBound`) for the initial molecule count (approx. 10 * (total reads count) / (minimal clone count))
  b. Estimate the error rate for each type of substitution (e.g. A -> T) by counting all variants in each cluster and averaging over all clusters, with possible outlier detection
  c. Cluster minor clones that satisfy the estimated error rates within e.g. 3 sigma
  d. Use (total reads count) / `lowerBound` as the absolute minimal concentration of a minor clone for clustering
- Test all these things against real data
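
To make the error-probability filter from the first item concrete, here is a minimal Python sketch of p = 1 - Prod_i(1 - 10^(-q_i/10)) over the Phred qualities of a clone's sequence. The function names, the `qualities` key and the `max_error_probability` cutoff are hypothetical and are not part of the MiXCR code base; the sum is done in log space only to stay accurate for long, high-quality sequences.

```python
import math


def clone_error_probability(qualities):
    """Probability that at least one base is wrong.

    Implements p = 1 - prod_i (1 - 10^(-q_i / 10)) for Phred scores q_i.
    """
    log_all_correct = 0.0
    for q in qualities:
        p_err = 10.0 ** (-q / 10.0)
        if p_err >= 1.0:
            # Phred 0 (or below) means the base is certainly unreliable.
            return 1.0
        # log(1 - p_err), accurate even when p_err is tiny.
        log_all_correct += math.log1p(-p_err)
    # 1 - exp(log_all_correct), again without cancellation.
    return -math.expm1(log_all_correct)


def filter_clones(clones, max_error_probability):
    """Keep clones whose error probability does not exceed the cutoff.

    `clones` is assumed to be a list of dicts with a "qualities" list;
    this layout is illustrative only.
    """
    return [
        clone
        for clone in clones
        if clone_error_probability(clone["qualities"]) <= max_error_probability
    ]
```

A suitable cutoff is not given above and would have to come from the tests against real data.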
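
The PCR error correction items can be sketched in the same way. The code below is one possible reading of the proposal, not an existing implementation. It assumes that per-substitution rates are the ratio of variant reads to parent reads in each cluster, that "within 3 sigma" means the observed ratio does not exceed the mean plus three standard deviations across clusters, and that minor clones below (total reads count) / `lowerBound` are not considered for clustering. Outlier detection is left out, and all names and data layouts are hypothetical.

```python
from statistics import mean, pstdev


def molecule_lower_bound(total_reads, minimal_clone_count, factor=10.0):
    """lowerBound ~ 10 * (total reads count) / (minimal clone count)."""
    if minimal_clone_count <= 0:
        raise ValueError("minimal_clone_count must be positive")
    return factor * total_reads / minimal_clone_count


def minimal_minor_clone_reads(total_reads, lower_bound):
    """(total reads count) / lowerBound, used as the absolute minimum."""
    return total_reads / lower_bound


def substitution_rates(clusters):
    """Mean and standard deviation of the error rate per substitution type.

    Each cluster is assumed to look like
    {"parent_reads": 1000, "variants": {("A", "T"): 10, ...}}.
    A substitution type missing from a cluster counts as rate 0 there.
    """
    types = {sub for cluster in clusters for sub in cluster["variants"]}
    rates = {}
    for sub in types:
        per_cluster = [
            cluster["variants"].get(sub, 0) / cluster["parent_reads"]
            for cluster in clusters
        ]
        rates[sub] = (mean(per_cluster), pstdev(per_cluster))
    return rates


def should_cluster(minor_reads, parent_reads, sub, rates, min_reads, n_sigma=3.0):
    """Decide whether a minor clone is attached to its parent (assumed rule)."""
    if minor_reads < min_reads:
        return False
    mu, sigma = rates[sub]
    return minor_reads / parent_reads <= mu + n_sigma * sigma
```

Under this reading, a minor clone is clustered only when its read count is large enough to be estimated reliably and its abundance relative to the parent is still explainable by the estimated substitution rate.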