Logloss
LogLoss is not defined for p = 0 and p = 1. Other toolkits clip predictions to [eps, 1 - eps] to overcome this: https://scikit-learn.org/stable/modules/generated/sklearn.metrics.log_loss.html
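For context, a minimal R sketch of the failure mode and of the clipping workaround (the eps value and the hand-rolled log-loss formula here are illustrative, not the package's implementation):

eps <- 1e-15
actual <- c(1, 0)
predicted <- c(0, 1)  # fully confident and fully wrong
# log(0) = -Inf, so the unclipped score is not finite
-mean(actual * log(predicted) + (1 - actual) * log(1 - predicted))  # Inf
# clipping first pulls the predictions off the boundaries
clipped <- pmax(eps, pmin(1 - eps, predicted))
-mean(actual * log(clipped) + (1 - actual) * log(1 - clipped))  # ~34.54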
Thank you for the contribution to the package. I'm torn about what to do here. My primary concern is backwards compatibility. What if some end user has a line of code like

if (!is.finite(logLoss(actual, predicted))) {
  predicted <- pmax(eps, pmin(1 - eps, predicted))
}
What do you think about this concern? Other contributors have proposed backwards incompatible changes, and I need to think through the implications.
On the other hand, I question whether the Metrics package should be clipping the user's predictions for them. Perhaps the user should do that themselves? Also, why did you choose 1e-12 rather than 1e-15 as scikit-learn uses or something like .Machine$double.xmin?
> What do you think about this concern? Other contributors have proposed backwards incompatible changes, and I need to think through the implications.
Well, that's a difficult question. One could introduce an additional argument (clip = FALSE) to stick to the old behavior. The same could be done for the undefined values of precision/recall in #36 (missing.val = NA vs. missing.val = 0). Note that you might end up with a package with really inconvenient defaults ...
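A hypothetical sketch of that opt-in argument (the function body is a stand-in for the package's actual logLoss, and the tolerance is just one of the candidates discussed below):

logLoss <- function(actual, predicted, clip = FALSE) {
  # clip = FALSE keeps the old, possibly non-finite behavior,
  # so existing user code is unaffected
  if (clip) {
    eps <- 1e-15  # placeholder tolerance, see below
    predicted <- pmax(eps, pmin(1 - eps, predicted))
  }
  -mean(actual * log(predicted) + (1 - actual) * log(1 - predicted))
}

With clip = FALSE as the default the change is backwards compatible, but then users who want finite scores must remember to pass clip = TRUE, which is the inconvenient-defaults worry above.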
> On the other hand, I question whether the Metrics package should be clipping the user's predictions for them. Perhaps the user should do that themselves? Also, why did you choose 1e-12 rather than 1e-15 as scikit-learn uses or something like .Machine$double.xmin?
It was late; 1e-15 should also work. AFAIK the more generic approach would be to use sqrt(.Machine$double.eps) (cf. ?all.equal).
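For reference, the magnitudes of the candidates (the .Machine values below are the usual IEEE-754 doubles; see ?.Machine):

1e-12                      # originally proposed here
1e-15                      # what scikit-learn uses
.Machine$double.xmin       # 2.225074e-308, smallest positive normalized double
sqrt(.Machine$double.eps)  # ~1.490116e-08, the default tolerance of all.equal()

Note that sqrt(.Machine$double.eps) is about seven orders of magnitude larger than 1e-15, so it perturbs near-certain predictions correspondingly more.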