Logloss
LogLoss is not defined for p = 0 and p = 1. Other toolkits clip predictions to [eps, 1 - eps] to overcome this: https://scikit-learn.org/stable/modules/generated/sklearn.metrics.log_loss.html
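For context, a minimal R sketch of the failure mode and of the clipping workaround (the eps value and the hand-rolled log-loss formula here are illustrative, not the package's implementation):

eps <- 1e-15
actual <- c(1, 0)
predicted <- c(0, 1)  # fully confident and fully wrong
# log(0) = -Inf, so the unclipped score is not finite
-mean(actual * log(predicted) + (1 - actual) * log(1 - predicted))  # Inf
# clipping first pulls the predictions off the boundaries
clipped <- pmax(eps, pmin(1 - eps, predicted))
-mean(actual * log(clipped) + (1 - actual) * log(1 - clipped))  # ~34.54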
Thank you for the contribution to the package. I'm torn about what to do here. My primary concern is backwards compatibility. What if some end user has a line of code like

if (!is.finite(logLoss(actual, predicted))) {
  predicted <- pmax(eps, pmin(1 - eps, predicted))
}
What do you think about this concern? Other contributors have proposed backwards incompatible changes, and I need to think through the implications.
On the other hand, I question whether the Metrics package should be clipping the user's predictions for them. Perhaps the user should do that themselves? Also, why did you choose 1e-12 rather than 1e-15 as scikit-learn uses or something like .Machine$double.xmin?
> What do you think about this concern? Other contributors have proposed backwards incompatible changes, and I need to think through the implications.
Well, that's a difficult question. One could introduce an additional argument (clip = FALSE) to stick to the old behavior. The same could be done for the undefined values of precision/recall in #36 (missing.val = NA vs. missing.val = 0). Note that you might end up with a package with really inconvenient defaults ...
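A hypothetical sketch of that opt-in argument (the function body is a stand-in for the package's actual logLoss, and the tolerance is just one of the candidates discussed below):

logLoss <- function(actual, predicted, clip = FALSE) {
  # clip = FALSE keeps the old, possibly non-finite behavior,
  # so existing user code is unaffected
  if (clip) {
    eps <- 1e-15  # placeholder tolerance, see below
    predicted <- pmax(eps, pmin(1 - eps, predicted))
  }
  -mean(actual * log(predicted) + (1 - actual) * log(1 - predicted))
}

With clip = FALSE as the default the change is backwards compatible, but then users who want finite scores must remember to pass clip = TRUE, which is the inconvenient-defaults worry above.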
> On the other hand, I question whether the Metrics package should be clipping the user's predictions for them. Perhaps the user should do that themselves? Also, why did you choose 1e-12 rather than 1e-15 as scikit-learn uses or something like .Machine$double.xmin?
It was late; 1e-15 should also work. AFAIK the more generic approach would be to use sqrt(.Machine$double.eps) (cf. ?all.equal).
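For reference, the magnitudes of the candidates (the .Machine values below are the usual IEEE-754 doubles; see ?.Machine):

1e-12                      # originally proposed here
1e-15                      # what scikit-learn uses
.Machine$double.xmin       # 2.225074e-308, smallest positive normalized double
sqrt(.Machine$double.eps)  # ~1.490116e-08, the default tolerance of all.equal()

Note that sqrt(.Machine$double.eps) is about seven orders of magnitude larger than 1e-15, so it perturbs near-certain predictions correspondingly more.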