
Big difference between results of qdnaseq and spectre

Open EugeneKim76 opened this issue 1 year ago • 3 comments

Dear all, I obtained CNV results from nanopore reads (coverage: ~50x, N50: ~40 kb). However, qdnaseq and Spectre gave very different results, as shown below.

qdnaseq: 4 CNVs
Spectre: ~300 CNVs

Which result is reliable?

Best

EugeneKim76 avatar Jun 04 '24 07:06 EugeneKim76

Hello @EugeneKim76

This can very well be normal behavior for Spectre compared to qdnaseq. Could you please share the command you used to run Spectre? Without it, it is hard to judge whether something is off.

Spectre produces coverage plots for every analysis, which give a rough estimate of where the DELs and DUPs are located (DEL = green, DUP = red). They are written to output_dir/img. The first clue that a CNV has a high probability of being true is a green or red bar that is centered on, or close to, a peak in the coverage (blue) track. Please note that the plots are only a rough visualization of the locations, and that CNVs can overlap in the plot.

Did you also check the VCF entries for their genotype quality (GQ)?
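A minimal sketch of such a GQ check, using only the Python standard library. It assumes the VCF carries a GQ field in the FORMAT column for the first sample and an SVTYPE key in INFO; the cutoff of 20 is an arbitrary illustration, not a Spectre recommendation, and the function names are hypothetical.

```python
import gzip


def open_vcf(path):
    """Open a plain or gzip-compressed VCF as text."""
    return gzip.open(path, "rt") if str(path).endswith(".gz") else open(path, "rt")


def iter_gq(lines):
    """Yield (chrom, pos, svtype, gq) for the first sample of each record."""
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and empty lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 10:
            continue  # record has no FORMAT/sample columns
        info = dict(kv.split("=", 1) for kv in fields[7].split(";") if "=" in kv)
        sample = dict(zip(fields[8].split(":"), fields[9].split(":")))
        gq_raw = sample.get("GQ", ".")
        gq = int(float(gq_raw)) if gq_raw != "." else None
        # Fall back to the ALT column if INFO has no SVTYPE (assumption).
        yield fields[0], int(fields[1]), info.get("SVTYPE", fields[4]), gq


def split_by_gq(lines, min_gq=20):
    """Split records into (kept, dropped) by a GQ cutoff (placeholder value)."""
    kept, dropped = [], []
    for rec in iter_gq(lines):
        passed = rec[3] is not None and rec[3] >= min_gq
        (kept if passed else dropped).append(rec)
    return kept, dropped
```

To run it on a real file, pass the open handle, for example `with open_vcf(path) as fh: kept, dropped = split_by_gq(fh)`, where `path` points to the VCF in the Spectre output directory. Comparing the size of `kept` with the total number of calls gives a first impression of how many calls are well supported.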

What minimum mapping quality did you use when running Mosdepth to retrieve the coverage (*regions.bed.gz)?
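For reference, the mapping quality cutoff is set on the Mosdepth command line with `--mapq` (`-Q`), and the window size for the regions.bed.gz file with `--by` (`-b`). The sketch below only assembles such a command so the settings can be inspected; the helper name, file paths, and the values 20, 1000, and 4 are placeholders, not confirmed defaults of Spectre or wf-human-variation.

```python
import shlex


def mosdepth_command(bam, prefix, min_mapq=20, window=1000, threads=4):
    """Build a mosdepth argument list (hypothetical helper, placeholder values)."""
    return [
        "mosdepth",
        "--threads", str(threads),
        "--fast-mode",             # skip per-base CIGAR handling for speed
        "--by", str(window),       # fixed windows, written to <prefix>.regions.bed.gz
        "--mapq", str(min_mapq),   # ignore reads below this mapping quality
        prefix,
        bam,
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so it can be reviewed first.
    print(shlex.join(mosdepth_command("sample.bam", "coverage/sample")))
```

The list can be passed to `subprocess.run(cmd, check=True)` once the values match your own run.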

Cheers, Philippe

philippesanio avatar Jun 04 '24 10:06 philippesanio

Thanks for the prompt reply. I obtained the CNV results with Spectre as used in https://github.com/epi2me-labs/wf-human-variation. For all cases, I used the same sample (N50: ~40 kb).

Case 1 (coverage: ~60x)
Options: default options of Spectre as used in https://github.com/epi2me-labs/wf-human-variation
Results: ~300 CNVs

python3 spectre.py CNVCaller \
  --bin-size 1000 \
  --threshhold-quantile 10 \
  --dist-proportion 0.3 \
  --coverage readstats/ \
  --sample-id sample-name \
  --output-dir spectre_output/ \
  --reference GRCh38_no_alt_analysis_set.fna \
  --blacklist black_list_bins_0.02_merged.bed \
  --min-cnv-len 80000 \
  --snv sample.wf_snp.vcf.gz \
  --metadata metadata_GRCh38_no_alt.mdr

Case 2 (coverage: ~60x)
Options: default options of Spectre
Results: ~55 CNVs

python3 spectre.py CNVCaller \
  --bin-size 1000 \
  --threshhold-quantile 5 \
  --dist-proportion 0.25 \
  --coverage readstats/ \
  --sample-id sample-name \
  --output-dir spectre_output/ \
  --reference GRCh38_no_alt_analysis_set.fna \
  --blacklist black_list_bins_0.02_merged.bed \
  --min-cnv-len 100000 \
  --snv sample.wf_snp.vcf.gz \
  --metadata metadata_GRCh38_no_alt.mdr

Case 3 (coverage: ~30x)
Options: default options of Spectre
Results: ~30 CNVs

python3 spectre.py CNVCaller \
  --bin-size 1000 \
  --threshhold-quantile 5 \
  --dist-proportion 0.25 \
  --coverage readstats/ \
  --sample-id sample-name \
  --output-dir spectre_output/ \
  --reference GRCh38_no_alt_analysis_set.fna \
  --blacklist black_list_bins_0.02_merged.bed \
  --min-cnv-len 100000 \
  --snv sample.wf_snp.vcf.gz \
  --metadata metadata_GRCh38_no_alt.mdr

For the coverage plots, a typical image (edited) is shown below. It is hard to find a green/red bar that is centered on or near a coverage peak.

[image]

EugeneKim76 avatar Jun 05 '24 08:06 EugeneKim76

Hi @EugeneKim76,

Thank you for providing more details.

Which version of Spectre are you running? I see a couple of old flags in your command that are not present in Spectre 0.2.1. For example, the bin size flag no longer exists, and we have switched from a coverage directory to a coverage file (regions.bed.gz).

Could you please try the latest version? If you are using a Conda environment, you can simply install the latest release with the package manager pip: pip install spectre-cnv

This could also be related to the open issue on wf-human-variation: https://github.com/epi2me-labs/wf-human-variation/issues/189

Thanks, Philippe

philippesanio avatar Jun 10 '24 13:06 philippesanio