PureCN

I need advice when there is no normal

Open MaryGoAround opened this issue 1 year ago • 3 comments

I have targeted DNA sequencing data for a tumour, and I do not have any normal samples (matched or even unmatched).

I am trying this, but I get an error:

 Rscript /R/x86_64-pc-linux-gnu-library/4.4/PureCN/extdata/PureCN.R --out purecn_output --tumor sample_coverage_loess.txt.gz --vcf tumor.vcf.gz --sampleid sample --genome hg38 --post-optimize --force --fun-segmentation Hclust
[1] "folder"
INFO [2024-08-22 16:33:26] Loading PureCN 2.10.0...
Error in .getNormalCoverage(normal.coverage.file) : 
  Need either normalDB or normal.coverage.file
Execution halted

Please advise me on how to use this software in this case.

Thanks a lot

sample_coverage_loess

MaryGoAround avatar Aug 22 '24 15:08 MaryGoAround

Hi @MaryGoAround , unfortunately, the software assumes normal samples. We don't use WGS, and for hybrid capture data, it is really difficult to normalize cancer data without normal samples. At least to get it as clean as needed for purity/ploidy determination.

I would check WGS specific software like https://github.com/Wedge-lab/battenberg . It's been a long time since I last used it, so there might be other tools more appropriate for your data.

Sorry that I cannot be more helpful.

lima1 avatar Aug 26 '24 19:08 lima1

Hi, do you have a contact or email address where we could discuss the advice you asked for? I have the same question.

El84Ja avatar Oct 07 '24 03:10 El84Ja

Hi

I finally got a plot for my tumour (a cell line) using this command:

for BAM in $TUMOR_BAMS; do
    # Derive the sample ID from the BAM file name
    SAMPLE_ID=$(basename "$BAM" .bam)
    VCF_PATH="tumor.vcf.gz"

    # Every continued line must end in a backslash, with no blank lines in between
    Rscript "$PURECN/PureCN.R" --out output \
        --tumor coverage_loess.txt.gz \
        --sampleid "$SAMPLE_ID" --vcf "$VCF_PATH" \
        --normaldb normalDB_wgs_hg38.rds \
        --intervals baits_hg38_intervals.txt --genome hg38 \
        --mapping-bias-file mapping_bias_wgs_hg38.rds \
        --fun-segmentation PSCBS --force --post-optimize --seed 123 \
        --max-copy-number 8 --min-purity 0.9 --max-purity 0.99 \
        --model-homozygous TRUE
done

Please correct me if I am interpreting the plot incorrectly:

  • Gray dots: Represent individual copy number bins for segments across the genome. These points reflect raw copy number values.

  • Purple and Red Lines: Indicate the segmented copy number. The two lines represent the major allele (C1) and minor allele (C2).

    • Purple Line: Major allele (C1)
    • Red Line: Minor allele (C2)

Thanks a lot for any help

MaryGoAround avatar Oct 10 '24 14:10 MaryGoAround
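
Note for readers who hit the same "Need either normalDB or normal.coverage.file" error: the first command in this thread passes neither a normal database nor a normal coverage file, while the later command passes --normaldb. The sketch below is a small pre-flight check that could run before calling PureCN.R. It is only an illustration: has_normal_reference is a hypothetical helper and not part of PureCN, and --normal as the flag for a single normal coverage file is an assumption that does not appear anywhere in this thread.

```shell
# Hypothetical helper (not part of PureCN): returns success only if the
# argument list contains --normaldb or --normal.
# Assumption: --normal is the PureCN.R flag for a normal coverage file.
has_normal_reference() {
    for arg in "$@"; do
        case "$arg" in
            --normaldb|--normaldb=*|--normal|--normal=*) return 0 ;;
        esac
    done
    return 1
}

# Arguments modelled on the first command in this thread (no normal reference).
ARGS="--out purecn_output --tumor sample_coverage_loess.txt.gz --vcf tumor.vcf.gz --sampleid sample --genome hg38"

# $ARGS is left unquoted on purpose so that it splits into separate words.
if has_normal_reference $ARGS; then
    echo "normal reference supplied"
else
    echo "no --normaldb or --normal given; PureCN.R is expected to stop with the error above"
fi
```

With the arguments shown, the check takes the else branch. Adding --normaldb normalDB_wgs_hg38.rds to ARGS, as in the later command, makes it pass.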