
Downsampling method of ichorCNA benchmarking

Open yuanzhao0502 opened this issue 6 years ago • 1 comments

I have a question about your benchmarking work. In your paper, you downsampled to the number of reads required to reach exactly 0.01, 0.02, …, 0.09, 0.10 tumor fraction at 0.1× coverage. In the equation, the number of reads controls what percentage to downsample to. Did you downsample the BAM file itself with Picard, or did you just multiply the read counts reported by HMMcopy by that percentage? When I downsample a BAM file with Picard, I find it difficult to control the final number of reads precisely, and when I then run ichorCNA to estimate tumor purity, the result is not good. My guess is that the downsampling is not random enough to keep the tumor purity at the expected value. Could you give me some suggestions?

yuanzhao0502 · Sep 23 '19 12:09

Hi @yuanzhao0502

I used Picard DownsampleSam. I also made sure to downsample a BAM that had duplicates removed. The readCounter tool in the ichorCNA pipeline ignores duplicates, so sampling duplicate reads was an issue.

Hope this helps. Gavin

gavinha · Oct 29 '19 16:10
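
For readers trying to reproduce this kind of downsampling, here is a minimal sketch of the arithmetic involved. It is not the paper's actual procedure. It assumes a simple in silico mixture in which a tumor-bearing sample with a known tumor fraction is diluted with reads from a normal sample, and that tumor fraction scales linearly with the share of reads drawn from each. The function names, the 3 Gb genome size, the 100 bp read length, and the example read totals are all hypothetical. The resulting probabilities are the kind of value one would pass to Picard DownsampleSam as `P=`, applied to duplicate-removed BAMs as suggested above.

```python
def reads_for_coverage(coverage, genome_size, read_length):
    """Approximate number of reads needed for a given mean coverage."""
    return round(coverage * genome_size / read_length)


def mixing_plan(target_tf, sample_tf, target_reads,
                tumor_total_reads, normal_total_reads):
    """Work out how many reads to keep from each BAM (hypothetical helper).

    target_tf          -- tumor fraction wanted in the mixture (e.g. 0.05)
    sample_tf          -- known tumor fraction of the tumor-bearing sample
    target_reads       -- total reads wanted in the mixture
    tumor_total_reads  -- non-duplicate reads in the tumor-bearing BAM
    normal_total_reads -- non-duplicate reads in the normal BAM
    """
    if not 0 < target_tf <= sample_tf <= 1:
        raise ValueError("need 0 < target_tf <= sample_tf <= 1")

    # Assumption: tumor fraction dilutes linearly with the read share.
    tumor_reads = round(target_reads * target_tf / sample_tf)
    normal_reads = target_reads - tumor_reads

    # Probability of keeping each read, one value per input BAM.
    p_tumor = tumor_reads / tumor_total_reads
    p_normal = normal_reads / normal_total_reads
    if p_tumor > 1 or p_normal > 1:
        raise ValueError("not enough reads in one of the input BAMs")

    return {
        "tumor_reads": tumor_reads,
        "normal_reads": normal_reads,
        "p_tumor": p_tumor,
        "p_normal": p_normal,
    }


if __name__ == "__main__":
    # Hypothetical inputs: 0.1x coverage of a 3 Gb genome with 100 bp reads.
    total = reads_for_coverage(0.1, 3_000_000_000, 100)
    plan = mixing_plan(target_tf=0.05, sample_tf=0.5, target_reads=total,
                       tumor_total_reads=30_000_000,
                       normal_total_reads=30_000_000)
    print(total, plan)
```

Because probabilistic downsampling keeps each read independently, the final read count will only be close to the target, not exact. Counting the reads in the output and comparing against `tumor_reads` and `normal_reads` is a reasonable sanity check before running ichorCNA.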