Laplace
Clarify how the softmax is handled for classification
Currently it's not really clear how the final softmax is dealt with in the classification case, which might lead to confusion / unintentional misuse of the library.
There are two things to clarify:
- The MAP model put into `Laplace` should not apply a softmax (either via an `nn.Softmax()` layer in the model or an `F.softmax()` call in the overridden forward pass) but should return the logits instead. This could probably most easily be fixed by clarifying it in the documentation/README, and additionally by raising a warning if the model outputs on the training set during `fit()` lie in [0, 1] and sum to 1.
- The `Laplace` model applies the softmax internally when making predictions, so the user shouldn't apply another softmax on top. Here we can probably only improve the documentation.
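The proposed warning could be implemented with a simple heuristic check. Below is a minimal sketch; `warn_if_probits` is a hypothetical helper name, not part of the library, and the thresholds are assumptions:

```python
import warnings

import torch


def warn_if_probits(outputs: torch.Tensor, tol: float = 1e-4) -> None:
    """Hypothetical helper: warn if a batch of model outputs looks like
    softmax probabilities (entries in [0, 1], rows summing to 1) rather
    than raw logits."""
    nonneg = bool((outputs >= 0).all())
    at_most_one = bool((outputs <= 1).all())
    rows_sum_to_one = bool(
        torch.allclose(outputs.sum(dim=-1), torch.ones(outputs.shape[0]), atol=tol)
    )
    if nonneg and at_most_one and rows_sum_to_one:
        warnings.warn(
            "Model outputs lie in [0, 1] and sum to 1; they look like softmax "
            "probabilities. Laplace expects logits -- remove any final softmax "
            "from the MAP model."
        )


logits = torch.randn(8, 3)
warn_if_probits(logits)                          # raw logits: typically silent
warn_if_probits(torch.softmax(logits, dim=-1))   # probabilities: warns
```

A check like this is cheap enough to run once on the first batch seen by `fit()`, and it catches both an `nn.Softmax()` layer and a manual `F.softmax()` in the forward pass.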
Currently it raises an exception like `Extension saving to kflr does not have an extension for Module <class 'torch.nn.modules.activation.Softmax'>`
:(
@youkaichao Thanks for letting us know -- this exception comes from the BackPACK backend we're using to compute second-order information; not sure if the ASDL backend throws an exception when using a softmax model (I'd assume it does not).
For now, just make sure that the MAP model does not have a softmax layer at the end for the library to work properly.
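To make the correct setup concrete, here is a sketch of a MAP model that returns logits, with the softmaxed variant shown (commented out) as the shape to avoid. The fitting/prediction lines are also commented out and only illustrate the intended flow; `train_loader` is a hypothetical DataLoader, and the exact `Laplace` arguments are assumptions about the library's API:

```python
import torch
import torch.nn as nn

# Correct: the MAP model ends in a Linear layer and returns raw logits.
model = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 3))

# Wrong: a trailing Softmax makes the BackPACK backend raise the exception
# above, and would double-apply the softmax at prediction time.
# model = nn.Sequential(nn.Linear(4, 16), nn.ReLU(),
#                       nn.Linear(16, 3), nn.Softmax(dim=-1))

# Sketch of the intended usage (assumed API; train_loader is hypothetical):
# from laplace import Laplace
# la = Laplace(model, "classification")
# la.fit(train_loader)
# probs = la(x)  # softmax is applied internally -- don't softmax again

x = torch.randn(2, 4)
logits = model(x)  # unnormalized scores, not probabilities
```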