
Potential Issue with CrossEntropyLoss in the Titanic Data Analysis Tutorial

jensqin opened this issue 3 years ago · 1 comment

📚 Documentation

In the Titanic Data Analysis tutorial, the use of CrossEntropyLoss is inappropriate. According to the PyTorch documentation, CrossEntropyLoss combines LogSoftmax and NLLLoss, so the final Softmax layer of TitanicSimpleNNModel is redundant: the loss effectively applies softmax to the model's output a second time.

import torch.nn as nn

class TitanicSimpleNNModel(nn.Module):
    # __init__ (which defines linear1, sigmoid1, linear2, sigmoid2,
    # linear3, and softmax) is omitted here; see the tutorial
    def forward(self, x):
        lin1_out = self.linear1(x)
        sigmoid_out1 = self.sigmoid1(lin1_out)
        sigmoid_out2 = self.sigmoid2(self.linear2(sigmoid_out1))
        # final softmax: redundant when training with CrossEntropyLoss
        return self.softmax(self.linear3(sigmoid_out2))

criterion = nn.CrossEntropyLoss()
# some code

for epoch in range(num_epochs):
    output = net(input_tensor)          # softmax probabilities, not raw logits
    loss = criterion(output, label_tensor)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
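
For reference, here is a minimal self-contained check (variable names are illustrative, not from the tutorial) showing that CrossEntropyLoss already performs the log-softmax step, and that feeding it softmax outputs applies softmax twice:

import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(4, 2)            # raw scores for a batch of 4
labels = torch.tensor([0, 1, 1, 0])

ce = nn.CrossEntropyLoss()
nll = nn.NLLLoss()

# identical values: CrossEntropyLoss == LogSoftmax + NLLLoss
print(ce(logits, labels))
print(nll(F.log_softmax(logits, dim=1), labels))

# different value: passing softmax probabilities applies softmax twice
print(ce(logits.softmax(dim=1), labels))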

jensqin · Aug 09 '22 19:08

Thank you for raising the question, @jensqin! It looks like we could avoid softmax here. The PyTorch documentation does call softmax in its examples as well, though: https://pytorch.org/docs/stable/generated/torch.nn.CrossEntropyLoss.html?highlight=crossentropyloss#torch.nn.CrossEntropyLoss
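
A minimal sketch of what avoiding the softmax could look like (it reuses the tutorial's layer names but is not the tutorial's actual code): have forward return raw logits and apply softmax only where probabilities are needed.

class TitanicSimpleNNModel(nn.Module):
    def forward(self, x):
        lin1_out = self.linear1(x)
        sigmoid_out1 = self.sigmoid1(lin1_out)
        sigmoid_out2 = self.sigmoid2(self.linear2(sigmoid_out1))
        return self.linear3(sigmoid_out2)   # raw logits, no softmax

# training: CrossEntropyLoss consumes the raw logits directly
loss = nn.CrossEntropyLoss()(net(input_tensor), label_tensor)

# inference: apply softmax only when class probabilities are needed
probs = torch.softmax(net(input_tensor), dim=1)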

cc: @vivekmig

NarineK · Aug 09 '22 21:08