CNVKit 0.9.9 Batch mode has inconsistent CN and log2Ratio
Hello,
I have a pair of WGS samples (a normal and a tumor), and I use the following command line to detect CNVs:
cnvkit.py batch $tumor --normal $normal \
    --fasta $reference \
    --annotate $refFlat \
    --output-dir $out/$sample \
    -p 20 -m wgs \
    --scatter --diagram
But the log2Ratio and the copy number in my sample.call.cns file are inconsistent. I list the top 10 lines below:

log2        cn
0.224267    3
-0.207029   2
0.0982319   2
-0.222901   2
0.140716    2
-0.117483   2
0.0339522   2
-0.0921905  2
0.157111    2

Could you tell me the reason? Or what is the definition of log2Ratio and cn in this file? Thank you!
Hi @JD12138,
The steps leading to the sample.call.cns file are not very well documented:
- Raw log2 ratios (the ones contained in your sample.cns file) are filtered to remove likely false-positive segments (based on a confidence interval calculation)
- Segments are median-centered and the "p_ttest" column is calculated (p-value of a t-test)
- Finally, integer copy number values (the "cn" column) are called from the log2 ratio values, using the threshold method with the following thresholds: -1.1, -0.25, 0.2, 0.7
To sum up, your observed CN and log2 ratio values are not inconsistent:
=> Every time a log2 ratio falls between -0.25 and 0.2, the called CN is "neutral" = reference ploidy = 2 (by default)
=> Your first row, with log2 ratio = 0.224267, is above the 0.2 threshold, so CN=3 is called
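The threshold step can be illustrated with a minimal Python sketch. This is not CNVkit's own code: the function name `call_cn` is hypothetical, and the handling of values above the last threshold (rounding up `ploidy * 2**log2`) is an assumption about how higher gains are treated. Only the four threshold values come from the explanation above.

```python
from bisect import bisect_left
from math import ceil

# Thresholds quoted above; each one is the upper bound for CN 0, 1, 2, 3.
THRESHOLDS = (-1.1, -0.25, 0.2, 0.7)


def call_cn(log2, thresholds=THRESHOLDS, ploidy=2):
    """Hypothetical helper: map a segment log2 ratio to an integer copy number."""
    # Index of the first threshold that is >= log2. With the default
    # thresholds and ploidy 2, that index is the called copy number.
    idx = bisect_left(thresholds, log2)
    if idx < len(thresholds):
        return idx
    # Assumption: above the last threshold, convert the log2 ratio back to
    # an absolute copy number and round up.
    return ceil(ploidy * 2 ** log2)


# Values taken from the sample.call.cns rows in the question.
for log2 in (0.224267, -0.207029, 0.0982319, -0.222901, 0.140716):
    print(log2, call_cn(log2))
```

Applied to the rows in the question, only 0.224267 exceeds 0.2, so it is the only segment called as CN=3; every other value lies between -0.25 and 0.2 and is called as CN=2.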
Hope this helped ! Have a nice day, Felix.
Hi, I recently updated CNVkit (I used 0.9.5 for a long time) and noticed that batch now also produces call.cns files with a p_ttest column and without the ci_hi / ci_lo columns. However, when I generate call.cns files manually, the ci_hi / ci_lo columns are preserved while p_ttest is not calculated. I wanted to ask whether the p-value can be calculated here using some built-in command or function, or only manually from the confidence intervals?