LossUpAccUp

Maybe I found the reason why they go opposite ways.

lcsama opened this issue 2 years ago · 1 comment

Hi, here is the result I got when reproducing your LeNet.ipynb (plot: lenet).

I just added a simple output = F.softmax(output, dim=-1) before loss = F.cross_entropy(output, target), and things changed.

We can easily understand the phenomenon through Bayes' probability.

Since this repo has gone 2 years without an update: if you are still interested in the reason, I would be glad to make a pull request.
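
For clarity, the change amounts to something like the following (just a sketch; the surrounding function and variable names are my own paraphrase, not the notebook's exact code):

import torch.nn.functional as F

def loss_original(model, data, target):
    output = model(data)                       # raw logits from LeNet
    return F.cross_entropy(output, target)     # log_softmax is applied inside

def loss_modified(model, data, target):
    output = model(data)
    output = F.softmax(output, dim=-1)         # the added line
    return F.cross_entropy(output, target)     # softmax is now applied twice overall

The only change is the extra softmax line before the loss; everything else in the training loop stays the same.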

lcsama · May 11 '22 08:05

Hi, lcsama

I had the same problem, and the test loss sometimes increased to inf. I'm very interested in your results and wonder why you added F.softmax before F.cross_entropy. I would have thought that the input to F.cross_entropy should be logits, since the softmax is already included in F.cross_entropy.

Here is the code for F.cross_entropy:

def cross_entropy(input, target, weight=None, size_average=None, ignore_index=-100,
                  reduce=None, reduction='mean'):
    if size_average is not None or reduce is not None:
        reduction = _Reduction.legacy_get_string(size_average, reduce)
    return nll_loss(log_softmax(input, 1), target, weight, None, ignore_index, None, reduction)
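
As a quick sanity check (a minimal sketch on a recent PyTorch; the tensors below are made-up examples, not data from the notebook), cross_entropy on raw logits matches nll_loss(log_softmax(...)), and feeding it softmax outputs gives a different value because the softmax ends up applied twice:

import torch
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(4, 10)            # made-up batch of 4 samples, 10 classes
target = torch.randint(0, 10, (4,))

# cross_entropy on raw logits == nll_loss(log_softmax(logits))
a = F.cross_entropy(logits, target)
b = F.nll_loss(F.log_softmax(logits, dim=1), target)
print(torch.allclose(a, b))            # True

# adding an explicit softmax first means softmax is effectively applied twice
c = F.cross_entropy(F.softmax(logits, dim=-1), target)
print(a.item(), c.item())              # the two loss values differ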

honeysuckle-lm · May 19 '22 12:05