hap.py
Calculation of METRIC.Precision
Dear hap.py development team,
Could you please tell me how METRIC.Precision is calculated in the hap.py summary statistics?
According to the docs, the following formula is used:
Precision = TP/(TP+FP)
So I tried to calculate the precision as TRUTH.TP/(TRUTH.TP+QUERY.FP); however, I get results that differ from METRIC.Precision. Is this behaviour expected? If so, could you please provide the formula (or the correct column names) to reproduce METRIC.Precision correctly?
See below the examples from your git repo. The behaviour is observed for vcfeval...
| | Type | Filter | TRUTH.TOTAL | TRUTH.TP | TRUTH.FN | QUERY.TOTAL | QUERY.FP | METRIC.Recall | **METRIC.Precision** | METRIC.F1_Score | **TRUTH.TP/(TRUTH.TP+QUERY.FP)** |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | INDEL | ALL | 8929 | 7968 | 961 | 11812 | 227 | 0.892373 | **0.972637** | 0.930778 | **0.9723001830384381** |
| 1 | INDEL | PASS | 8929 | 7660 | 1269 | 9971 | 175 | 0.857879 | **0.978155** | 0.914077 | **0.9776643267389917** |
| 2 | SNP | ALL | 52494 | 52174 | 320 | 90092 | 504 | 0.993904 | **0.990444** | 0.992171 | **0.9904324385891644** |
| 3 | SNP | PASS | 52494 | 46955 | 5539 | 48078 | 90 | 0.894483 | **0.998089** | 0.94345 | **0.9980869380380487** |
... and happy
| | Type | Filter | TRUTH.TOTAL | TRUTH.TP | TRUTH.FN | QUERY.TOTAL | QUERY.FP | METRIC.Recall | **METRIC.Precision** | METRIC.F1_Score | **TRUTH.TP/(TRUTH.TP+QUERY.FP)** |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | INDEL | ALL | 8937 | 7839 | 1098 | 11812 | 343 | 0.87714 | **0.958635** | 0.916079 | **0.9580787093620142** |
| 1 | INDEL | PASS | 8937 | 7550 | 1387 | 9971 | 283 | 0.844803 | **0.964656** | 0.90076 | **0.9638708030128942** |
| 2 | SNP | ALL | 52494 | 52125 | 369 | 90092 | 582 | 0.992971 | **0.988966** | 0.990964 | **0.9889578234390118** |
| 3 | SNP | PASS | 52494 | 46920 | 5574 | 48078 | 143 | 0.893816 | **0.996963** | 0.942576 | **0.9969615196651297** |
... but not for unhappy
| | Type | Filter | TRUTH.TOTAL | TRUTH.TP | TRUTH.FN | QUERY.TOTAL | QUERY.FP | METRIC.Recall | **METRIC.Precision** | METRIC.F1_Score | **TRUTH.TP/(TRUTH.TP+QUERY.FP)** |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | INDEL | ALL | 8937 | 7060 | 1877 | 11812 | 1232 | 0.789974 | **0.851423** | 0.819548 | **0.8514230583695128** |
| 1 | INDEL | PASS | 8937 | 6850 | 2087 | 9971 | 1157 | 0.766476 | **0.855501** | 0.808546 | **0.8555014362432871** |
| 2 | SNP | ALL | 52494 | 52105 | 389 | 90092 | 639 | 0.99259 | **0.987885** | 0.990232 | **0.9878848779008039** |
| 3 | SNP | PASS | 52494 | 46908 | 5586 | 48078 | 178 | 0.893588 | **0.99622** | 0.942117 | **0.9962196831329907** |
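For anyone who wants to reproduce the comparison, here is a minimal Python sketch using the INDEL ALL rows from the three tables in this post. The function names `naive_precision` and `implied_tp` are made up for illustration and are not part of hap.py. The second function only back-solves Precision = TP/(TP+FP) for the TP count that would yield the reported METRIC.Precision given QUERY.FP. Whether hap.py really uses a TP count other than TRUTH.TP is an assumption that nothing in this thread confirms.

```python
# Rows copied from the tables in this issue:
# (engine, Type, Filter, TRUTH.TP, QUERY.FP, reported METRIC.Precision)
ROWS = [
    ("vcfeval", "INDEL", "ALL", 7968, 227, 0.972637),
    ("happy", "INDEL", "ALL", 7839, 343, 0.958635),
    ("unhappy", "INDEL", "ALL", 7060, 1232, 0.851423),
]


def naive_precision(truth_tp, query_fp):
    """Precision computed the way the question does: TRUTH.TP / (TRUTH.TP + QUERY.FP)."""
    return truth_tp / (truth_tp + query_fp)


def implied_tp(precision, query_fp):
    """Solve precision = tp / (tp + fp) for tp.

    Hypothetical helper: it shows which TP count would reproduce the
    reported precision, assuming QUERY.FP is the FP term actually used.
    """
    return precision * query_fp / (1.0 - precision)


for engine, vtype, vfilter, truth_tp, query_fp, reported in ROWS:
    naive = naive_precision(truth_tp, query_fp)
    tp_needed = implied_tp(reported, query_fp)
    print(
        f"{engine:8s} {vtype} {vfilter}: "
        f"reported={reported:.6f} naive={naive:.6f} "
        f"diff={reported - naive:+.6f} "
        f"TRUTH.TP={truth_tp} implied TP~{tp_needed:.1f}"
    )
```

For the unhappy rows the implied TP count comes out at essentially TRUTH.TP, while for vcfeval and happy it is larger than TRUTH.TP. That would be consistent with precision being computed from a different TP count than the one shown in the TRUTH.TP column, but this is only an inference from the numbers above.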
Thank you very much for your time and I'm looking forward to hearing from you soon.
Best regards, Barbara
I am seeing the same thing. I tried playing with the numbers but it just doesn't add up...