E.L

Results 4 comments of E.L

Same applies for me: the model outputs a logit vector, not softmax probabilities.
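For anyone in the same spot, here is a minimal sketch of turning a logit vector into probabilities. The function name `softmax` and the plain-list input are assumptions for illustration, not taken from any particular library or from the model being discussed.

```python
import math

def softmax(logits):
    """Convert a list of raw logits into probabilities that sum to 1."""
    # Subtract the max logit first for numerical stability;
    # this does not change the result.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])
confidence = max(probs)  # the value most calibration metrics expect
```

Calibration metrics are normally defined on these probabilities, so feeding raw logits into them would likely give meaningless numbers.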

I don't think ECE is differentiable, bro.
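To make that concrete, here is a rough sketch of the usual binned ECE, assuming equal-width confidence bins (one common definition, not the only one). The function name and arguments are hypothetical. Both the hard bin assignment and the 0/1 correctness flag are piecewise constant in the model output, so the gradient is zero almost everywhere.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE: weighted mean of |accuracy - avg confidence| per bin."""
    n = len(confidences)
    conf_sum = [0.0] * n_bins
    hit_sum = [0.0] * n_bins
    count = [0] * n_bins
    for conf, hit in zip(confidences, correct):
        # Hard bin assignment: a step function of conf, hence no useful gradient.
        b = min(int(conf * n_bins), n_bins - 1)
        conf_sum[b] += conf
        hit_sum[b] += hit  # hit is 1 if argmax prediction == label, else 0
        count[b] += 1
    ece = 0.0
    for b in range(n_bins):
        if count[b] == 0:
            continue  # empty bins contribute nothing
        avg_conf = conf_sum[b] / count[b]
        accuracy = hit_sum[b] / count[b]
        ece += (count[b] / n) * abs(accuracy - avg_conf)
    return ece

# Four predictions at 0.9 confidence, three of them correct.
ece = expected_calibration_error([0.9, 0.9, 0.9, 0.9], [1, 1, 1, 0])
```

So ECE works fine as an evaluation metric, but using it directly as a training loss would need some smooth surrogate.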

But that being said, NLL is the metric that we should minimise in order to make P(Y=y^|y^=f(x)) = f(x) [perfectly calibrated model, you may think the output probs follow a...