David Holzmüller
> I wonder if such functionality should be in the TableReport, or rather in the column_association function, but not exposed in the TableReport. It seems a bit specific and low...
In my case, there were ~150 features, so it would at least be good if the table were sorted. Although if I'm in the position of a casual user that...
it might, assuming that I'm reading this section :)
Almost. It would be $\left[ y_1, \cdots, y_n \right] \mapsto \mathrm{softmax}\left( \left[ \frac{\log(y_1)}{T}, \cdots, \frac{\log(y_n)}{T} \right] \right)$ if the $y_i$ are probabilities (actually, they don't need to be normalized for...
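To make this concrete, here is a rough NumPy sketch of that map (my own illustration, not code from this PR; `temperature_scale_proba` is a name I made up):

```python
import numpy as np
from scipy.special import softmax

def temperature_scale_proba(y, T):
    """Row-wise softmax(log(y) / T) for probabilities (or any
    positive scores) y. Since softmax is invariant to adding a
    per-row constant, y does not need to be normalized."""
    return softmax(np.log(y) / T, axis=-1)
```

With T = 1 this just renormalizes y; T > 1 flattens the distribution and T < 1 sharpens it.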
Post-hoc calibration is essentially about learning a classifier whose inputs are the model outputs that should be calibrated. In this sense, post-hoc calibration of multiclass classification is an unsolved problem...
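In the binary case, "a classifier whose inputs are the model outputs" can be as simple as Platt scaling, i.e. a logistic regression on held-out scores. A minimal sketch (my own illustration; the function and variable names are hypothetical):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def fit_platt(scores_val, y_val):
    """Platt scaling: fit a 1D logistic regression that treats the
    base model's held-out scores as its only feature."""
    calibrator = LogisticRegression()
    calibrator.fit(np.asarray(scores_val).reshape(-1, 1), y_val)
    return calibrator

# calibrated probabilities for new scores:
# fit_platt(scores_val, y_val).predict_proba(scores_test.reshape(-1, 1))[:, 1]
```

In the multiclass case the calibrator's input is a whole vector of model outputs, which is presumably part of why the problem is harder there.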
> Not the whole story. Post-hoc calibration estimates P(Y|m(X)) with model m, while the original goal is to estimate P(Y|X). Note the difference in conditioning! For binary classification, m(X) is...
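In symbols (my paraphrase of the point about conditioning): a post-hoc calibrator that only sees $m(X)$ can at best recover

$$c^*(m(X)) = \mathbb{E}\left[ Y \mid m(X) \right] = \mathbb{E}\left[ \, \mathbb{E}\left[ Y \mid X \right] \mid m(X) \right]$$

by the tower property, whereas the original target is $\mathbb{E}\left[ Y \mid X \right]$; whatever information about $X$ the model $m$ discards cannot be recovered post hoc.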
@virchan 1. has been evaluated at least in the original paper I mentioned, and probably in many others as well. 2. The "matrix scaling" method mostly performed much worse than...
Thank you! I don't have time to check the code in detail right now, but in the testing, you apply temperature scaling to probabilities instead of logits (e.g. log-probabilities). Also,...
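A quick NumPy check of why this matters (my own illustration, not the test code): scaling log-probabilities is equivalent to scaling logits, because $\log p = z - \mathrm{logsumexp}(z)$ differs from the logits $z$ only by a per-row constant, which the softmax cancels; scaling the probabilities themselves is not.

```python
import numpy as np
from scipy.special import softmax, log_softmax

rng = np.random.default_rng(0)
z = rng.normal(size=(4, 3))        # logits
log_p = log_softmax(z, axis=-1)    # log-probabilities
T = 2.0

# scaling log-probabilities == scaling logits (per-row shift cancels)
assert np.allclose(softmax(z / T, axis=-1), softmax(log_p / T, axis=-1))

# scaling the probabilities themselves gives a different result
assert not np.allclose(softmax(z / T, axis=-1),
                       softmax(np.exp(log_p) / T, axis=-1))
```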
I don't know if there is a specific reference for the optimal temperature being >= 1; it is my intuition, for the following reason: The original temperature scaling paper shows...
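For reference, the fitting procedure from the original paper (Guo et al., "On Calibration of Modern Neural Networks", 2017) is a one-parameter NLL minimization on a held-out set; a minimal sketch (the search bounds are arbitrary choices of mine):

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.special import log_softmax

def fit_temperature(logits, labels):
    """Find T > 0 minimizing the NLL of softmax(logits / T)
    on held-out (logits, labels)."""
    def nll(T):
        log_p = log_softmax(logits / T, axis=-1)
        return -log_p[np.arange(len(labels)), labels].mean()
    return minimize_scalar(nll, bounds=(0.05, 20.0), method="bounded").x
```

For overconfident networks the minimizer typically comes out above 1, which matches the intuition above.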
> > it seems empirically this brings value on non-NN algorithms as well, if I read this thread correctly.
>
> Could you please point me to it because I...