hap.py

TRUTH.TOTAL differs despite having used the same TRUTH SET

Open robertzeibich opened this issue 3 years ago • 3 comments

Do you know why the TRUTH.TOTAL differs despite having used the same TRUTH SET?

[Image: hap.py output]

robertzeibich avatar Oct 20 '22 00:10 robertzeibich

Yes, I have that same question.

shinlin77 avatar Apr 03 '23 19:04 shinlin77

Same here. I tried:

  1. Running the full GiaB HG002 set (AshkenazimTrio/HG002_NA24385_son/NISTv4.2.1/GRCh38) as both the truth and the sample.
  2. Running the full GiaB HG002 set (AshkenazimTrio/HG002_NA24385_son/NISTv4.2.1/GRCh38) as the truth and an HG002 subset (1000 variants) as the sample.

I got different TRUTH.TOTAL counts in the two summaries. However, the number of annotated variants in the hap.py annotated output VCF is the same in both tests and is equal to the count from the second test (full set as the truth and subset as the sample).

This seems to apply only to the vcfeval engine, though. With xcmp, the same numbers seem to be reported in both tests.
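For anyone who wants to reproduce the comparison, here is a minimal sketch of how the TRUTH.TOTAL values from two runs could be diffed. It assumes each run's summary CSV has `Type`, `Filter`, and `TRUTH.TOTAL` columns; the helper names and the inline numbers are hypothetical and only illustrate the idea, they are not real hap.py results.

```python
import csv
import io


def truth_totals(summary_csv_text):
    """Map (Type, Filter) -> TRUTH.TOTAL from the text of a summary CSV.

    Assumes the columns 'Type', 'Filter' and 'TRUTH.TOTAL' are present.
    """
    reader = csv.DictReader(io.StringIO(summary_csv_text))
    return {
        (row["Type"], row["Filter"]): int(float(row["TRUTH.TOTAL"]))
        for row in reader
    }


def diff_truth_totals(run_a, run_b):
    """Return only the rows whose TRUTH.TOTAL differs between two runs."""
    keys = sorted(set(run_a) | set(run_b))
    return {
        key: (run_a.get(key), run_b.get(key))
        for key in keys
        if run_a.get(key) != run_b.get(key)
    }


# Made-up stand-ins for the two summary files (not real results).
run_full = "Type,Filter,TRUTH.TOTAL\nSNP,ALL,100\nINDEL,ALL,20\n"
run_subset = "Type,Filter,TRUTH.TOTAL\nSNP,ALL,90\nINDEL,ALL,20\n"

differences = diff_truth_totals(truth_totals(run_full), truth_totals(run_subset))
print(differences)
```

With real runs, the text of each summary file would be read from disk and passed in instead of the inline strings. Running the same diff on the vcfeval and xcmp outputs would show whether the discrepancy is engine-specific.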

janoppelt avatar Mar 13 '24 13:03 janoppelt

bump, same issue.

zeeev avatar Aug 12 '24 23:08 zeeev