hap.py

Calculation of METRIC.Precision

Open b-math opened this issue 2 years ago • 1 comments

Dear hap.py development team,

Could you please tell me how METRIC.Precision is calculated in the hap.py summary statistics?

According to the docs, the following formula is used: Precision = TP/(TP+FP)

So I tried to calculate the precision as TRUTH.TP/(TRUTH.TP+QUERY.FP), but the results differ from METRIC.Precision. Is this behaviour expected? If so, could you please provide the formula (or the correct column names) needed to reproduce METRIC.Precision?
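The manual calculation described above can be sketched as follows. This is only an illustration: the function name `precision` is hypothetical, and the numbers are taken from the vcfeval INDEL/ALL row of the tables in this issue.

```python
def precision(tp: int, fp: int) -> float:
    """Precision = TP / (TP + FP), the formula quoted from the docs."""
    return tp / (tp + fp)


# vcfeval, INDEL / ALL row: TRUTH.TP = 7968, QUERY.FP = 227
manual = precision(7968, 227)

# METRIC.Precision reported by hap.py for the same row
reported = 0.972637

print(f"manual={manual:.6f} reported={reported:.6f} diff={reported - manual:.6f}")
```

The difference is far larger than the six-digit rounding of the reported value, so it is not a rounding artifact.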

The examples below are from your git repo. The behaviour is observed for

vcfeval...

| | Type | Filter | TRUTH.TOTAL | TRUTH.TP | TRUTH.FN | QUERY.TOTAL | QUERY.FP | METRIC.Recall | **METRIC.Precision** | METRIC.F1_Score | **TRUTH.TP/(TRUTH.TP+QUERY.FP)** |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | INDEL | ALL | 8929 | 7968 | 961 | 11812 | 227 | 0.892373 | **0.972637** | 0.930778 | **0.9723001830384381** |
| 1 | INDEL | PASS | 8929 | 7660 | 1269 | 9971 | 175 | 0.857879 | **0.978155** | 0.914077 | **0.9776643267389917** |
| 2 | SNP | ALL | 52494 | 52174 | 320 | 90092 | 504 | 0.993904 | **0.990444** | 0.992171 | **0.9904324385891644** |
| 3 | SNP | PASS | 52494 | 46955 | 5539 | 48078 | 90 | 0.894483 | **0.998089** | 0.94345 | **0.9980869380380487** |

... and happy

| | Type | Filter | TRUTH.TOTAL | TRUTH.TP | TRUTH.FN | QUERY.TOTAL | QUERY.FP | METRIC.Recall | **METRIC.Precision** | METRIC.F1_Score | **TRUTH.TP/(TRUTH.TP+QUERY.FP)** |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | INDEL | ALL | 8937 | 7839 | 1098 | 11812 | 343 | 0.87714 | **0.958635** | 0.916079 | **0.9580787093620142** |
| 1 | INDEL | PASS | 8937 | 7550 | 1387 | 9971 | 283 | 0.844803 | **0.964656** | 0.90076 | **0.9638708030128942** |
| 2 | SNP | ALL | 52494 | 52125 | 369 | 90092 | 582 | 0.992971 | **0.988966** | 0.990964 | **0.9889578234390118** |
| 3 | SNP | PASS | 52494 | 46920 | 5574 | 48078 | 143 | 0.893816 | **0.996963** | 0.942576 | **0.9969615196651297** |

... but not for unhappy

| | Type | Filter | TRUTH.TOTAL | TRUTH.TP | TRUTH.FN | QUERY.TOTAL | QUERY.FP | METRIC.Recall | **METRIC.Precision** | METRIC.F1_Score | **TRUTH.TP/(TRUTH.TP+QUERY.FP)** |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | INDEL | ALL | 8937 | 7060 | 1877 | 11812 | 1232 | 0.789974 | **0.851423** | 0.819548 | **0.8514230583695128** |
| 1 | INDEL | PASS | 8937 | 6850 | 2087 | 9971 | 1157 | 0.766476 | **0.855501** | 0.808546 | **0.8555014362432871** |
| 2 | SNP | ALL | 52494 | 52105 | 389 | 90092 | 639 | 0.99259 | **0.987885** | 0.990232 | **0.9878848779008039** |
| 3 | SNP | PASS | 52494 | 46908 | 5586 | 48078 | 178 | 0.893588 | **0.99622** | 0.942117 | **0.9962196831329907** |

Thank you very much for your time and I'm looking forward to hearing from you soon.

Best regards,
Barbara

b-math, May 18 '22 14:05

I am seeing the same thing. I tried playing with the numbers, but they just don't add up...

skDooley, Aug 29 '23 20:08
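One possible explanation, offered here as an assumption rather than a confirmed answer: METRIC.Precision may be computed from a query-side TP count (a QUERY.TP column, which is not shown in the tables in this issue) instead of TRUTH.TP. The two counts can differ when truth and query represent the same variants differently. Under that assumption, the reported precision can be inverted to see which TP count it implies. The helper name `implied_query_tp` is hypothetical.

```python
def implied_query_tp(precision: float, query_fp: int) -> float:
    """Solve P = TP / (TP + FP) for TP, given P and FP."""
    return precision * query_fp / (1.0 - precision)


# (METRIC.Precision, QUERY.FP, TRUTH.TP) for the INDEL / ALL rows in this issue
rows = {
    "vcfeval": (0.972637, 227, 7968),
    "unhappy": (0.851423, 1232, 7060),
}

for name, (p, fp, truth_tp) in rows.items():
    # METRIC.Precision is rounded to six digits, so the result is approximate
    print(name, "implied TP:", round(implied_query_tp(p, fp)), "TRUTH.TP:", truth_tp)
```

For the vcfeval row the implied count is about 8069, which is higher than TRUTH.TP (7968). For the unhappy row it is about 7060, which equals TRUTH.TP. That pattern is consistent with the mismatch coming from a different TP count on the query side, but it remains an inference until it is checked against the QUERY.TP column of the extended output.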