CNVkit: how to accurately select samples using cnvkit.py metrics *.cnr -s *.cns

Thanks a lot. I have many normal samples, and I use cnvkit.py metrics *.cnr -s *.cns to find the noisy ones. The CNVkit docs https://cnvkit.readthedocs.io/en/stable/reports.html#metrics describe several metrics for this, but in practice it is not easy to decide which samples to drop.

Some samples, marked with a red arrow, should probably be removed because of their high segment counts. Is there any other advice?

Jul 21 '21 01:07 worker000000

@etal @tetedange13 any comments on this? Thanks a lot.

Oct 10 '21 10:10 worker000000

It can be helpful to plot each of these columns and look for outliers visually. In your case I'd recommend opening this table in a spreadsheet, sorting by each of the columns individually, and looking for any extreme values, either numerically or by plotting. The average coverage depth of each sample, or the number of reads in the BAM, is also a useful heuristic. If the same samples are being used for other 'omics analyses, it can be wise to use consistent sample acceptance criteria across analyses.

Oct 11 '21 21:10 etal
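
The sort-by-each-column check suggested above can also be scripted instead of done in a spreadsheet. This is only a minimal sketch. It assumes the metrics table is tab-separated with the columns sample, segments, stdev, mad, iqr and bivar. The sample names and values are made up for illustration, and load_metrics and rank_by_column are hypothetical helper names, not part of CNVkit.

```python
import csv
import io

# Made-up example table; the column names are an assumption about the
# layout of the cnvkit.py metrics output, and the values are illustrative.
METRICS_TSV = "\n".join([
    "sample\tsegments\tstdev\tmad\tiqr\tbivar",
    "normal_A\t40\t0.21\t0.15\t0.20\t0.16",
    "normal_B\t38\t0.23\t0.16\t0.22\t0.17",
    "normal_C\t45\t0.20\t0.14\t0.19\t0.15",
    "normal_D\t310\t0.55\t0.30\t0.45\t0.38",
    "normal_E\t42\t0.22\t0.15\t0.21\t0.16",
])


def load_metrics(text):
    """Parse the tab-separated table into a list of dicts.

    The sample column stays a string; every other column becomes a float.
    """
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return [
        {key: (value if key == "sample" else float(value))
         for key, value in record.items()}
        for record in reader
    ]


def rank_by_column(rows, column, top=3):
    """Return the `top` (sample, value) pairs with the largest values."""
    ordered = sorted(rows, key=lambda row: row[column], reverse=True)
    return [(row["sample"], row[column]) for row in ordered[:top]]


rows = load_metrics(METRICS_TSV)
for column in ("segments", "stdev", "mad", "iqr", "bivar"):
    # Samples that sit at the top of several columns are worth a closer look.
    print(column, rank_by_column(rows, column))
```

To use it on real data, read the saved table from a file and pass its contents to load_metrics in place of METRICS_TSV.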

@etal thanks a lot, we often keep value in [mean - 3 * sigma, mean + 3 * sigma], can this also be applied to cnvkit filter samples in command metrics?

Oct 25 '21 06:10 worker000000
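
For reference, the [mean - 3 * sigma, mean + 3 * sigma] rule from the question can be sketched as below for a single metrics column. This is an illustration under stated assumptions, not a recommendation from the CNVkit authors. The segment counts are made up, and three_sigma_outliers and robust_outliers are hypothetical helper names. One caveat: the extreme sample itself inflates both the mean and sigma, and with 10 or fewer samples no value can ever fall more than 3 sample standard deviations from the mean. The second function therefore shows a swapped-in alternative based on the median and the scaled median absolute deviation.

```python
import statistics


def three_sigma_outliers(values_by_sample, n_sigma=3.0):
    """Return samples outside [mean - n_sigma*sigma, mean + n_sigma*sigma]."""
    values = list(values_by_sample.values())
    mean = statistics.mean(values)
    sigma = statistics.stdev(values)  # sample standard deviation (n - 1)
    low, high = mean - n_sigma * sigma, mean + n_sigma * sigma
    return {s for s, v in values_by_sample.items() if not low <= v <= high}


def robust_outliers(values_by_sample, n_mad=3.0):
    """Alternative rule (not from the thread): median +/- n_mad * scaled MAD.

    The factor 1.4826 makes the MAD comparable to a standard deviation
    for normally distributed data. If the MAD is zero, every value that
    differs from the median is reported.
    """
    values = list(values_by_sample.values())
    median = statistics.median(values)
    mad = 1.4826 * statistics.median(abs(v - median) for v in values)
    low, high = median - n_mad * mad, median + n_mad * mad
    return {s for s, v in values_by_sample.items() if not low <= v <= high}


# Made-up segment counts for twelve hypothetical normal samples.
segments = {
    "normal_A": 38, "normal_B": 40, "normal_C": 41, "normal_D": 39,
    "normal_E": 42, "normal_F": 40, "normal_G": 37, "normal_H": 43,
    "normal_I": 41, "normal_J": 39, "normal_K": 40, "normal_L": 400,
}

print(sorted(three_sigma_outliers(segments)))
print(sorted(robust_outliers(segments)))
```

The same functions can be applied to each column in turn (segments, stdev, mad, iqr, bivar), and the flagged samples are candidates for manual review rather than automatic removal.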