hap.py TRUTH.TOTAL differs despite having used the same TRUTH SET

Open robertzeibich opened this issue 3 years ago • 3 comments

Do you know why the TRUTH.TOTAL differs despite having used the same TRUTH SET?

Oct 20 '22 00:10 robertzeibich

Yes, I have that same question.

Apr 03 '23 19:04 shinlin77

Same here. I tried:

1. Run full GiaB HG002 (AshkenazimTrio/HG002_NA24385_son/NISTv4.2.1/GRCh38) as both the truth and the sample.
2. Run full GiaB HG002 (AshkenazimTrio/HG002_NA24385_son/NISTv4.2.1/GRCh38) as the truth and an HG002 subset (1000 variants) as the sample.

I got different TRUTH.TOTAL counts in the two summaries. However, the number of annotated variants in the hap.py annotated output VCF is the same in both runs and equals the count from the second test (full set as the truth and subset as the sample).

However, this seems to apply only to the vcfeval engine. With xcmp, the reported numbers appear to be the same.
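
For anyone wanting to reproduce the comparison, here is a minimal Python sketch that diffs TRUTH.TOTAL between two hap.py summary CSVs. It assumes each summary has `Type`, `Filter`, and `TRUTH.TOTAL` columns; the function names and the file names in the usage comment are hypothetical and not part of hap.py.

```python
import csv
import io


def truth_totals(summary_csv_text):
    """Map (Type, Filter) -> TRUTH.TOTAL for the text of one summary CSV.

    Assumes the columns Type, Filter and TRUTH.TOTAL are present.
    """
    reader = csv.DictReader(io.StringIO(summary_csv_text))
    return {
        (row["Type"], row["Filter"]): int(float(row["TRUTH.TOTAL"]))
        for row in reader
    }


def diff_truth_totals(text_a, text_b):
    """Return the rows whose TRUTH.TOTAL differs between two summaries.

    Only (Type, Filter) keys present in both summaries are compared.
    """
    a = truth_totals(text_a)
    b = truth_totals(text_b)
    return {
        key: (a[key], b[key])
        for key in sorted(a.keys() & b.keys())
        if a[key] != b[key]
    }


# Usage (hypothetical file names, one summary per test run):
# with open("full_vs_full.summary.csv") as fa, \
#         open("full_vs_subset.summary.csv") as fb:
#     print(diff_truth_totals(fa.read(), fb.read()))
```

An empty result would mean the two runs agree on TRUTH.TOTAL for every row they share. Running it once per engine (vcfeval and xcmp) should show whether the discrepancy is engine-specific.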

Mar 13 '24 13:03 janoppelt

bump, same issue.

Aug 12 '24 23:08 zeeev
