
hap.py comparison with query SNP.vcf

Open junyanzho opened this issue 5 years ago • 1 comments

Dear developer, I compared the truth set NA12877.vcf.gz (which includes both SNP and INDEL variants) against a query file that contains only SNP variants. In summary.csv, the INDEL rows show nonzero values in the TRUTH.TP and TRUTH.FN columns, while all the QUERY columns are zero. The summary table is below:

Type   Filter  TRUTH.TOTAL  TRUTH.TP  TRUTH.FN  QUERY.TOTAL  QUERY.FP  QUERY.UNK  FP.gt
INDEL  ALL          523711      2648    521063            0         0          0      0
INDEL  PASS         523711      2646    521065            0         0          0      0
SNP    ALL         3519056   3484008     35048      3815273      3372     326544   2394
SNP    PASS        3519056   3479369     39687      3781822      3005     298100   2199
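As a sanity check on the table above, here is a minimal Python sketch that confirms TRUTH.TOTAL equals TRUTH.TP + TRUTH.FN in every row and computes recall per row. The numbers are copied from the table; the `rows` dictionary and the `recall` helper are hypothetical names used only for illustration, and recall is assumed to be the usual TP / (TP + FN).

```python
# Truth-side counts copied from the summary table above.
rows = {
    ("INDEL", "ALL"): {"TRUTH.TOTAL": 523711, "TRUTH.TP": 2648, "TRUTH.FN": 521063},
    ("INDEL", "PASS"): {"TRUTH.TOTAL": 523711, "TRUTH.TP": 2646, "TRUTH.FN": 521065},
    ("SNP", "ALL"): {"TRUTH.TOTAL": 3519056, "TRUTH.TP": 3484008, "TRUTH.FN": 35048},
    ("SNP", "PASS"): {"TRUTH.TOTAL": 3519056, "TRUTH.TP": 3479369, "TRUTH.FN": 39687},
}


def recall(row):
    """Recall under the assumed standard definition TP / (TP + FN)."""
    return row["TRUTH.TP"] / (row["TRUTH.TP"] + row["TRUTH.FN"])


for (variant_type, filter_name), row in rows.items():
    # Every truth variant should be counted as either a TP or an FN.
    assert row["TRUTH.TP"] + row["TRUTH.FN"] == row["TRUTH.TOTAL"]
    print(f"{variant_type:5s} {filter_name:4s} recall = {recall(row):.4f}")
```

The INDEL recall comes out well below 1%, which fits a query set that has no indel records: only a small fraction of truth indels is matched at all.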

junyanzho, Mar 30 '20 09:03

I guess your query set has no indels in it, which is why the INDEL QUERY columns are zero. As for why the INDEL TRUTH.TP is greater than zero even though your query set does not contain indels, see https://cdn.rawgit.com/RealTimeGenomics/rtg-core/master/installer/resources/core/RTGOperationsManual/rtg_command_reference.html#benchmarking-performance-for-snps-versus-indels
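One likely reason is variant representation: when the comparison is done at the haplotype level, a pair of truth indels can describe the same sequence as a pair of query SNPs, so the truth indels are counted as matched. The toy Python sketch below illustrates only that idea. It is not how hap.py or vcfeval is implemented, and the reference sequence, the variants, and the `apply_variants` helper are all made up for the example.

```python
def apply_variants(ref, variants):
    """Apply non-overlapping (pos, ref_allele, alt_allele) variants to ref.

    pos is 1-based, as in VCF. Variants are applied right to left so that
    earlier coordinates stay valid after a length change.
    """
    seq = ref
    for pos, ref_allele, alt_allele in sorted(variants, reverse=True):
        start = pos - 1
        end = start + len(ref_allele)
        if seq[start:end] != ref_allele:
            raise ValueError(f"REF mismatch at position {pos}")
        seq = seq[:start] + alt_allele + seq[end:]
    return seq


# Hypothetical toy reference and variant sets (not taken from NA12877).
reference = "CAGT"
truth_indels = [(1, "CA", "C"), (3, "G", "GA")]  # one deletion, one insertion
query_snps = [(2, "A", "G"), (3, "G", "A")]      # two SNPs

truth_haplotype = apply_variants(reference, truth_indels)
query_haplotype = apply_variants(reference, query_snps)

# Both representations produce the same haplotype.
print(truth_haplotype, query_haplotype)
```

In this toy case the truth records are typed as indels and the query records as SNPs, yet they produce an identical haplotype, which is the kind of situation that would give a nonzero INDEL TRUTH.TP alongside an INDEL QUERY.TOTAL of zero.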

Lenbok, Mar 30 '20 19:03