David Holzmüller
> I wonder if such functionality should be in the TableReport, or rather in the column_association function, but not exposed in the TableReport. It seems a bit specific and low...
In my case, there were ~150 features, so it would at least be good if the table were sorted. Although if I'm in the position of a casual user that...
it might, assuming that I'm reading this section :)
Almost. It would be $\left[ y_1, \cdots, y_n \right] \mapsto \mathrm{softmax}\left( \left[ \frac{\log(y_1)}{T}, \cdots, \frac{\log(y_n)}{T} \right] \right)$ if the $y_i$ are probabilities (actually, they don't need to be normalized for...
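To make this concrete, here is a rough NumPy sketch of that map (my own illustration, not code from this PR; `temperature_scale_proba` is a name I made up):

```python
import numpy as np
from scipy.special import softmax

def temperature_scale_proba(y, T):
    """Row-wise softmax(log(y) / T) for probabilities (or any
    positive scores) y. Since softmax is invariant to adding a
    per-row constant, y does not need to be normalized."""
    return softmax(np.log(y) / T, axis=-1)
```

With T = 1 this just renormalizes y; T > 1 flattens the distribution and T < 1 sharpens it.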
Post-hoc calibration is essentially about learning a classifier whose inputs are the model outputs that should be calibrated. In this sense, post-hoc calibration of multiclass classification is an unsolved problem...
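In the binary case, "a classifier whose inputs are the model outputs" can be as simple as Platt scaling, i.e. a logistic regression on held-out scores. A minimal sketch (my own illustration; the function and variable names are hypothetical):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def fit_platt(scores_val, y_val):
    """Platt scaling: fit a 1D logistic regression that treats the
    base model's held-out scores as its only feature."""
    calibrator = LogisticRegression()
    calibrator.fit(np.asarray(scores_val).reshape(-1, 1), y_val)
    return calibrator

# calibrated probabilities for new scores:
# fit_platt(scores_val, y_val).predict_proba(scores_test.reshape(-1, 1))[:, 1]
```

In the multiclass case the calibrator's input is a whole vector of model outputs, which is presumably part of why the problem is harder there.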
> Not the whole story. Post-hoc calibration estimates P(Y|m(X)) with model m, while the original goal is to estimate P(Y|X). Note the difference in conditioning! For binary classification, m(X) is...
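In symbols (my paraphrase of the point about conditioning): a post-hoc calibrator that only sees $m(X)$ can at best recover

$$c^*(m(X)) = \mathbb{E}\left[ Y \mid m(X) \right] = \mathbb{E}\left[ \, \mathbb{E}\left[ Y \mid X \right] \mid m(X) \right]$$

by the tower property, whereas the original target is $\mathbb{E}\left[ Y \mid X \right]$; whatever information about $X$ the model $m$ discards cannot be recovered post hoc.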
@virchan 1. has been evaluated at least in the original paper I mentioned, and probably in many others as well. 2. The "matrix scaling" method mostly performed much worse than...
Thank you! I don't have time to check the code in detail right now, but in the testing, you apply temperature scaling to probabilities instead of logits (e.g. log-probabilities). Also,...
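A quick NumPy check of why this matters (my own illustration, not the test code): scaling log-probabilities is equivalent to scaling logits, because $\log p = z - \mathrm{logsumexp}(z)$ differs from the logits $z$ only by a per-row constant, which the softmax cancels; scaling the probabilities themselves is not.

```python
import numpy as np
from scipy.special import softmax, log_softmax

rng = np.random.default_rng(0)
z = rng.normal(size=(4, 3))        # logits
log_p = log_softmax(z, axis=-1)    # log-probabilities
T = 2.0

# scaling log-probabilities == scaling logits (per-row shift cancels)
assert np.allclose(softmax(z / T, axis=-1), softmax(log_p / T, axis=-1))

# scaling the probabilities themselves gives a different result
assert not np.allclose(softmax(z / T, axis=-1),
                       softmax(np.exp(log_p) / T, axis=-1))
```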
I don't know if there is a specific reference for the optimal temperature being >= 1; it is my intuition, for the following reason: The original temperature scaling paper shows...
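For reference, the fitting procedure from the original paper (Guo et al., "On Calibration of Modern Neural Networks", 2017) is a one-parameter NLL minimization on a held-out set; a minimal sketch (the search bounds are arbitrary choices of mine):

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.special import log_softmax

def fit_temperature(logits, labels):
    """Find T > 0 minimizing the NLL of softmax(logits / T)
    on held-out (logits, labels)."""
    def nll(T):
        log_p = log_softmax(logits / T, axis=-1)
        return -log_p[np.arange(len(labels)), labels].mean()
    return minimize_scalar(nll, bounds=(0.05, 20.0), method="bounded").x
```

For overconfident networks the minimizer typically comes out above 1, which matches the intuition above.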
> > it seems empirically this brings value on non-NN algorithms as well, if I read this thread correctly.
>
> Could you please point me to it because I...