
How to accurately select samples using cnvkit.py metrics *.cnr -s *.cns

Open worker000000 opened this issue 4 years ago • 3 comments

Thanks a lot. I have many normal samples, and I use cnvkit.py metrics *.cnr -s *.cns to find the noisy ones. The CNVkit docs (https://cnvkit.readthedocs.io/en/stable/reports.html#metrics) describe several metrics to check, but applying them in practice is not that easy.

Some samples (marked with red arrows) should probably be removed because they have a high number of segments. Is there any other advice?

image

worker000000 avatar Jul 21 '21 01:07 worker000000

@etal @tetedange13 Any comments on this? Thanks a lot.

worker000000 avatar Oct 10 '21 10:10 worker000000

It can be helpful to plot each of these columns and look for outliers visually. In your case I'd recommend opening this table in a spreadsheet, sorting by each of the columns individually, and looking for any extreme values, either numerically or by plotting. The average coverage depth of each sample, or the number of reads in the BAM, is also a useful heuristic. If the same samples are being used for other 'omics analyses, it can be wise to use consistent sample acceptance criteria across analyses.

etal avatar Oct 11 '21 21:10 etal
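
For readers who prefer a script to a spreadsheet, the sort-and-inspect approach described above could be sketched roughly as follows. This is only an illustration: the column names (sample, segments, bivar) are assumed to match the header of the metrics table, the helper extremes_by_column is hypothetical, and the numbers are made up.

```python
import pandas as pd


def extremes_by_column(table, columns, top=3):
    """For each column, list the `top` samples with the largest values."""
    return {
        col: table.sort_values(col, ascending=False)["sample"].head(top).tolist()
        for col in columns
    }


# Made-up values for illustration only. In practice the table would be loaded
# from the saved metrics output, e.g. pd.read_csv("metrics.tsv", sep="\t"),
# where "metrics.tsv" is a hypothetical file name.
toy = pd.DataFrame({
    "sample": ["N1", "N2", "N3", "N4", "N5"],
    "segments": [40, 45, 210, 42, 38],
    "bivar": [0.11, 0.10, 0.12, 0.35, 0.09],
})

top_samples = extremes_by_column(toy, ["segments", "bivar"], top=2)
for column, samples in top_samples.items():
    print(column, samples)
```

Samples that show up at the top of several columns are natural candidates for a closer look; the ranking by itself does not decide whether a sample should be dropped.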

@etal thanks a lot, we often keep value in [mean - 3 * sigma, mean + 3 * sigma], can this also be applied to cnvkit filter samples in command metrics?

worker000000 avatar Oct 25 '21 06:10 worker000000
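
The [mean - 3 * sigma, mean + 3 * sigma] rule from the last comment could be applied to the metrics table along these lines. This is a minimal sketch, not a recommendation from the CNVkit authors: the column names are assumed, flag_three_sigma is a hypothetical helper, and the data are made up. One known caveat is that the mean and sigma are themselves pulled by outliers, so with only a handful of samples no value can fall outside 3 sigma.

```python
import pandas as pd


def flag_three_sigma(table, columns, k=3.0):
    """Return rows where any listed column lies outside mean +/- k * sigma."""
    values = table[columns]
    mean = values.mean()
    sigma = values.std()  # sample standard deviation (ddof=1)
    outside = (values < mean - k * sigma) | (values > mean + k * sigma)
    return table[outside.any(axis=1)]


# Made-up table of 20 samples; the last one has an unusually high segment count.
toy = pd.DataFrame({
    "sample": [f"N{i}" for i in range(1, 21)],
    "segments": [48, 50, 52, 50] * 4 + [48, 50, 52, 400],
    "bivar": [0.10, 0.12, 0.11, 0.13] * 5,
})

flagged = flag_three_sigma(toy, ["segments", "bivar"])
print(flagged["sample"].tolist())
```

The threshold k is left as a parameter so that the same acceptance criterion used in other analyses can be reused here.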