Laplace
Clarify how the softmax is handled for classification
Currently it's not really clear how the final softmax is dealt with in the classification case, which might lead to confusion / unintentional misuse of the library.
There are two things to clarify:
- The MAP model put into `Laplace` should not apply a softmax (either via an `nn.Softmax()` layer in the model or an `F.softmax()` call in the overridden forward pass) but should return the logits instead. This could probably most easily be fixed by clarifying it in the documentation/README, and additionally by raising a warning if the model outputs on the training set during `fit()` lie in [0, 1] and sum to 1.
- The `Laplace` model applies the softmax internally when making predictions, so the user shouldn't apply another softmax on top. Here we can probably only improve the documentation.
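The proposed warning could be implemented with a simple heuristic check. Below is a minimal sketch; `warn_if_probits` is a hypothetical helper name, not part of the library, and the thresholds are assumptions:

```python
import warnings

import torch


def warn_if_probits(outputs: torch.Tensor, tol: float = 1e-4) -> None:
    """Hypothetical helper: warn if a batch of model outputs looks like
    softmax probabilities (entries in [0, 1], rows summing to 1) rather
    than raw logits."""
    nonneg = bool((outputs >= 0).all())
    at_most_one = bool((outputs <= 1).all())
    rows_sum_to_one = bool(
        torch.allclose(outputs.sum(dim=-1), torch.ones(outputs.shape[0]), atol=tol)
    )
    if nonneg and at_most_one and rows_sum_to_one:
        warnings.warn(
            "Model outputs lie in [0, 1] and sum to 1; they look like softmax "
            "probabilities. Laplace expects logits -- remove any final softmax "
            "from the MAP model."
        )


logits = torch.randn(8, 3)
warn_if_probits(logits)                          # raw logits: typically silent
warn_if_probits(torch.softmax(logits, dim=-1))   # probabilities: warns
```

A check like this is cheap enough to run once on the first batch seen by `fit()`, and it catches both an `nn.Softmax()` layer and a manual `F.softmax()` in the forward pass.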
Currently it raises an exception like `Extension saving to kflr does not have an extension for Module <class 'torch.nn.modules.activation.Softmax'>`
:(
@youkaichao Thanks for letting us know -- this exception comes from the BackPACK backend we're using to compute second-order information; not sure if the ASDL backend throws an exception when using a softmax model (I'd assume it does not).
For now, just make sure that the MAP model does not have a softmax layer at the end for the library to work properly.
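To make the correct setup concrete, here is a sketch of a MAP model that returns logits, with the softmaxed variant shown (commented out) as the shape to avoid. The fitting/prediction lines are also commented out and only illustrate the intended flow; `train_loader` is a hypothetical DataLoader, and the exact `Laplace` arguments are assumptions about the library's API:

```python
import torch
import torch.nn as nn

# Correct: the MAP model ends in a Linear layer and returns raw logits.
model = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 3))

# Wrong: a trailing Softmax makes the BackPACK backend raise the exception
# above, and would double-apply the softmax at prediction time.
# model = nn.Sequential(nn.Linear(4, 16), nn.ReLU(),
#                       nn.Linear(16, 3), nn.Softmax(dim=-1))

# Sketch of the intended usage (assumed API; train_loader is hypothetical):
# from laplace import Laplace
# la = Laplace(model, "classification")
# la.fit(train_loader)
# probs = la(x)  # softmax is applied internally -- don't softmax again

x = torch.randn(2, 4)
logits = model(x)  # unnormalized scores, not probabilities
```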