Get the log2R for each segment from facet result

Open ysbioinfo opened this issue 6 years ago • 7 comments

Hi, I am using FACETS to estimate the ASCNV for my WES data. I can now get the absolute copy number for each segment, but I then want to call recurrent CNVs with GISTIC, which requires a log2 ratio for each segment. Could you tell me how to get the logR information from the FACETS result? Thanks! Below is my code for running FACETS.

    library(facets)

    input_dir <- '/data1/XiLab/shiyang/data/ICC_103/ICC_103_snp_pileup_1/'
    output_dir <- '/data1/XiLab/shiyang/data/ICC_103/ICC_103_facets_result_1/'
    files <- dir(input_dir)

    for (file in files){
      path <- paste(input_dir, file, sep = '')
      # sample name = file name up to the first dot
      case <- strsplit(file, '\\.')[[1]][1]
      pre_data <- preProcSample(file = path)
      data <- procSample(pre_data, cval = 150)
      fit_data <- emcncf(data)
      pur <- fit_data$purity
      plo <- fit_data$ploidy
      tmp_df <- data.frame(sample = case, purity = pur, ploidy = plo)
      # per-segment coordinates plus the copy number table
      seg_df <- data.frame(fit_data$start, fit_data$end, fit_data$cncf)
      write.table(seg_df, paste(output_dir, case, '.seg_info.txt', sep = ''), sep = '\t', row.names = FALSE)
    }

ysbioinfo, Jul 10 '18 10:07

The cnlr.median for each segment is the relevant segment log-ratio value.

veseshan, Jul 10 '18 20:07
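
As a minimal sketch of pulling out that column: this assumes `fit` is the list returned by `emcncf()` and that its `cncf` data frame carries `chrom`, `start`, `end`, `num.mark` and `cnlr.median` columns (some older FACETS versions appear to return `start` and `end` as separate list elements instead, as in the code above). The `fit` object below is a hand-made stand-in with invented numbers, so the snippet runs without FACETS installed.

```r
# Hypothetical stand-in for the list returned by facets::emcncf();
# all numbers are invented for illustration only.
fit <- list(
  dipLogR = -0.12,
  cncf = data.frame(
    chrom       = c(1, 1, 2),
    start       = c(10000, 5000000, 20000),
    end         = c(4999999, 9000000, 8000000),
    num.mark    = c(820, 410, 1300),
    cnlr.median = c(-0.12, 0.35, -0.60)
  )
)

# Keep the segment coordinates together with the observed log-ratio summary.
seg_logr <- fit$cncf[, c("chrom", "start", "end", "num.mark", "cnlr.median")]
print(seg_logr)
```

With a real fit, the same column selection should apply to `fit_data$cncf` from the loop in the original post, provided the column names match your FACETS version.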

Thank you!

ysbioinfo, Jul 11 '18 15:07

Hi Venkat,

Adding to this: can we really use these "raw" cnlr.median values, for example to check the concordance of CN between two matched samples? In many samples, the dipLogR value is not zero. Shouldn't we therefore correct each cnlr.median for dipLogR in order to have comparable data?

Thanks for your valuable input! Best,

Cedric

cedricvanm, Oct 12 '18 08:10

Hi Cedric,

cnlr.median is the segment-level summary of the observed data. Correcting it for dipLogR takes one line of code, but once corrected it is no longer a summary of the observed data. Also, the corrected values are not comparable between two samples, since the sample purity also has to be accounted for. It is best to compare the estimated copy numbers.

Thanks, Venkat

veseshan, Oct 12 '18 15:10
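
To illustrate the purity point, here is a small sketch. It assumes the usual tumor/normal mixture model, in which a segment with total copy number `tcn` in a sample of purity `p` has an expected dipLogR-corrected log-ratio of log2((p * tcn + 2 * (1 - p)) / 2). That formula is an assumption made for this example, and the function name `expected_corrected_logr` is hypothetical, not part of FACETS.

```r
# Expected (cnlr.median - dipLogR) under an assumed two-component
# tumor/normal mixture; not a FACETS function.
expected_corrected_logr <- function(tcn, purity) {
  log2((purity * tcn + 2 * (1 - purity)) / 2)
}

# The same single-copy loss (tcn = 1) in two samples of different purity.
high_purity <- expected_corrected_logr(tcn = 1, purity = 0.9)
low_purity  <- expected_corrected_logr(tcn = 1, purity = 0.3)

print(c(high_purity = high_purity, low_purity = low_purity))
```

Under this assumption the high-purity sample shows a clearly more negative value than the low-purity one for the identical copy number state, which is why comparing estimated copy numbers is the safer route.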

Hello,

I want to generate segmentation files for GISTIC as well. It looks like there are conflicting answers on this; see https://github.com/mskcc/facets/issues/84#issuecomment-392079533. Please correct me if I am wrong. Do we need to subtract dipLogR from cnlr.median for GISTIC analysis across a cohort?

Thanks.

ahwanpandey, Feb 06 '19 01:02

I don't see a conflict. Please use cnlr.median - dipLogR.

veseshan, Feb 06 '19 14:02
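
A minimal sketch of that subtraction, written out as a six-column segmentation table of the kind GISTIC expects (sample, chromosome, start, end, number of markers, segment log-ratio). It assumes the `emcncf()` result exposes `dipLogR` and a `cncf` data frame with `chrom`, `start`, `end`, `num.mark` and `cnlr.median`. The `fit` object, the sample name and the output column names below are hypothetical placeholders with invented values.

```r
# Hypothetical stand-in for the list returned by facets::emcncf();
# all numbers are invented for illustration only.
fit <- list(
  dipLogR = -0.12,
  cncf = data.frame(
    chrom       = c(1, 1, 2),
    start       = c(10000, 5000000, 20000),
    end         = c(4999999, 9000000, 8000000),
    num.mark    = c(820, 410, 1300),
    cnlr.median = c(-0.12, 0.35, -0.60)
  )
)
sample_id <- "sample_01"  # hypothetical sample name

# Centre each segment's observed log-ratio on the diploid level.
gistic_seg <- data.frame(
  Sample       = sample_id,
  Chromosome   = fit$cncf$chrom,
  Start        = fit$cncf$start,
  End          = fit$cncf$end,
  Num_Markers  = fit$cncf$num.mark,
  Segment_Mean = fit$cncf$cnlr.median - fit$dipLogR
)

# Written to a temporary directory here; point this at your own output path.
out_file <- file.path(tempdir(), paste0(sample_id, ".gistic.seg"))
write.table(gistic_seg, out_file, sep = "\t", quote = FALSE, row.names = FALSE)
```

For a cohort, the per-sample tables would be built the same way and row-bound into one file, each with its own `dipLogR`.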

In the #84 comment, you wrote that "(cnlr.median - dipLogR) is the log(total copy number) unadjusted for purity which is what you want" and mentioned "the log tcn values, which corresponds to also adjusting for the purity"; in the comment above, you wrote that "cnlr.median is the segment level summary of observed data". Did you mean:

  1. cnlr.median is the raw segment-level value, not adjusted for purity or ploidy?
  2. cnlr.median - dipLogR is adjusted for ploidy only?
  3. The log tcn values are adjusted for both purity and ploidy?

Sorry, I am confused by these values. Please correct me if I have made a mistake. Also, which value should be used for GISTIC 2.0?

Thank you very much.

qindan2008, Aug 11 '21 02:08