Scoring rules for metrics and losses
Disclaimer: I'm an engineer by training, not a statistician, so I may get some of this wrong. I don't think the general thrust has technical gaps, but I could be off in some of the details. If I am, please go talk to your friendly neighborhood stats professor and get their thoughts on the idea.
Background:
I hang out at Cross Validated, the statistics Stack Exchange site, where I have interacted and learned a ton over the years.
One of the most substantial families of threads there asks why accuracy is not an ideal metric in many settings. Many of the folks engaged in those discussions are fantastic PhDs, in academia and industry, who have been teaching or practicing for decades, so they are a very important source of technical wisdom.
Here are some of the threads there:
- https://stats.stackexchange.com/questions/359909/is-accuracy-an-improper-scoring-rule-in-a-binary-classification-setting
- https://stats.stackexchange.com/questions/357466/are-unbalanced-datasets-problematic-and-how-does-oversampling-purport-to-he
- https://stats.stackexchange.com/questions/312780/why-is-accuracy-not-the-best-measure-for-assessing-classification-models
- https://stats.stackexchange.com/questions/359909/is-accuracy-an-improper-scoring-rule-in-a-binary-classification-setting/359936#359936
- https://stats.stackexchange.com/questions/368949/example-when-using-accuracy-as-an-outcome-measure-will-lead-to-a-wrong-conclusio
They like these things called "strictly proper score functions" or "strictly proper scoring rules".
Here are references on strictly proper scoring rules (a rough paraphrase of the definition follows this list):
- https://apps.dtic.mil/sti/pdfs/ADA459827.pdf
- (of course) https://en.wikipedia.org/wiki/Scoring_rule#Proper_scoring_rules
- (other folks in github have engaged them) https://github.com/mlr-org/mlr/issues/880
- https://sites.stat.washington.edu/raftery/Research/PDF/Gneiting2007jasa.pdf
- https://www.tensorflow.org/probability/api_docs/python/tfp/stats/brier_score
- https://xianblog.wordpress.com/2017/11/21/the-hyvarinen-score-is-back/
- https://faculty.missouri.edu/~merklee/pub/MerkleSteyvers2013.pdf
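For anyone who doesn't want to chase those links, my rough paraphrase of the definition (see the Gneiting & Raftery paper above for the precise statement): a scoring rule $S(p, y)$ scores a predictive distribution $p$ against an observed outcome $y$, and it is strictly proper when reporting the true distribution is the unique best move in expectation:

$$
\mathbb{E}_{y \sim q}\big[S(q, y)\big] \;\ge\; \mathbb{E}_{y \sim q}\big[S(p, y)\big] \quad \text{for every forecast } p,
$$

with equality only when $p = q$ (for a positively oriented score; the inequality flips if the rule is stated as a loss). Log loss and the Brier score satisfy this; thresholded accuracy does not, which is the crux of the threads above.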
When I go to the Keras loss and metrics pages, I don't see those scoring rules called out explicitly, and I think that's a miss. Some of them may be in there under other names, but if so, I missed them.
Current losses from documentation:
- binary/categorical cross-entropy (this is the log loss, i.e. the logarithmic scoring rule; a quick check follows this list)
- KL divergence
- Poisson class
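For what it's worth, a quick numerical check (my own, not from the docs) suggests the existing categorical cross-entropy is already the negative logarithmic scoring rule, i.e. the negative log probability assigned to the observed class:

```python
import numpy as np
import tensorflow as tf

# Categorical cross-entropy vs. the negative log score of the observed class.
y_true = np.array([[0.0, 1.0, 0.0]])   # one-hot outcome
y_pred = np.array([[0.2, 0.7, 0.1]])   # predicted class probabilities

cce = tf.keras.losses.CategoricalCrossentropy()
print(float(cce(y_true, y_pred)))       # ~0.3567
print(-np.log(0.7))                     # ~0.3567, the (negated) logarithmic score
```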
Current metrics from documentation:
- Accuracy
- Binary/Categorical/TopK accuracy
- Binary/categorical crossentropy
- AUC, plus the true/false positive/negative based measures
Recommendation/Suggestion:
I think you should add the following "strictly proper scoring rules" to Keras, because it would make it easier for new users (and their pointy-haired bosses) to use technically exemplary approaches in some of their problem solving; a rough sketch of two of them as custom losses follows the list below.
Some rules to consider:
- Brier/quadratic scoring rule
- Hyvarinen scoring rule
- Spherical scoring rule
- Logarithmic scoring rule (log probability)
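To make the ask concrete, here is a minimal sketch (my own code, not existing Keras API; the function names are placeholders) of what two of these could look like as custom tf-keras losses, assuming the model outputs class probabilities (e.g. from a softmax) and the targets are one-hot:

```python
import tensorflow as tf

def brier_score(y_true, y_pred):
    # Quadratic/Brier scoring rule: mean squared difference between the
    # predicted probability vector and the one-hot outcome, per sample.
    return tf.reduce_mean(tf.square(y_pred - y_true), axis=-1)

def spherical_score(y_true, y_pred, eps=1e-7):
    # Spherical scoring rule: probability assigned to the observed class,
    # normalized by the L2 norm of the full probability vector.
    # Negated so that "lower is better", matching the Keras loss convention.
    observed = tf.reduce_sum(y_true * y_pred, axis=-1)
    norm = tf.norm(y_pred, axis=-1) + eps
    return -observed / norm

# Usage sketch: as the training loss and/or as extra metrics.
# model.compile(optimizer="adam", loss=brier_score,
#               metrics=[spherical_score, "accuracy"])
```

Built-in, documented versions (with the same conveniences the cross-entropy losses already have, such as from_logits and label_smoothing) would be much nicer than everyone rolling their own.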