
Split cross entropy can be further optimized

ReactiveCJ opened this issue 7 years ago • 2 comments

https://github.com/salesforce/awd-lstm-lm/blob/32fcb42562aeb5c7e6c9dec3f2a3baaaf68a5cb5/splitcross.py#L137

For a word in a tail cluster C, its probability is p(C) * p(x=target|C), so for a one-hot target its cross-entropy contribution is -log(p(C) * p(x=target|C)) = -(log p(C) + log p(x=target|C)).

We can simply compute the cross entropy on the head (including the tombstone tokens) and then compute the cross entropy on each tail, so there is no need to pass head_entropy along below.
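The factorization above can be illustrated numerically. This is a minimal sketch in plain Python (not the repository's actual SplitCrossEntropyLoss code); the head/tail logit values are made up for the example, and the last head entry stands for the tombstone of a single hypothetical tail cluster:

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of logits.
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    z = sum(exps)
    return [e / z for e in exps]

# Hypothetical tiny setup: head = 3 frequent words + 1 tombstone,
# tail = 4 rare words in the cluster that tombstone represents.
head_logits = [2.0, 1.0, 0.5, 0.1]   # last entry is the tombstone
tail_logits = [1.5, 0.3, -0.2, 0.7]

p_head = softmax(head_logits)
p_tail = softmax(tail_logits)

# Probability of tail word t factorizes as p(C) * p(x=t|C),
# where p(C) is the tombstone's head probability.
t = 2
p_word = p_head[-1] * p_tail[t]

# One-hot cross-entropy contribution:
# -log(p(C) * p(x=t|C)) = -(log p(C) + log p(x=t|C))
loss = -(math.log(p_head[-1]) + math.log(p_tail[t]))
assert abs(loss - (-math.log(p_word))) < 1e-12
```

Because the log factorizes into a sum, the head term only needs to be added once when forming the per-word log-probability.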

ReactiveCJ avatar Jul 13 '18 06:07 ReactiveCJ

At line 166 of splitcross.py, "entropy = -(head_entropy + tail_entropy)", we do not need to add head_entropy again, because it has already been added into logprob via "results.append(head_entropy.view(-1, 1) + tail_entropy)".

ReactiveCJ avatar Jul 13 '18 09:07 ReactiveCJ

I replaced this complicated SplitCrossEntropyLoss with PyTorch's built-in cross-entropy loss, which produces the same results and seems only slightly slower.
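For reference, what PyTorch's nn.CrossEntropyLoss computes per sample is just the standard log-sum-exp form over the full (unsplit) vocabulary. A minimal pure-Python sketch of that per-sample formula, with made-up logits:

```python
import math

def cross_entropy(logits, target):
    """Softmax cross-entropy for a single example: matches the
    per-sample value of PyTorch's nn.CrossEntropyLoss, i.e.
    log(sum_j exp(logit_j)) - logit_target."""
    m = max(logits)  # shift for numerical stability
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_z - logits[target]

# Hypothetical logits over a full 7-word vocabulary.
logits = [2.0, 1.0, 0.5, 1.5, 0.3, -0.2, 0.7]
target = 5
loss = cross_entropy(logits, target)
```

The trade-off is that the full softmax normalizes over the entire vocabulary on every step, whereas the split version only touches the head plus one tail cluster, which is why the built-in loss can be slightly slower on large vocabularies.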

octavian-ganea avatar Oct 05 '18 17:10 octavian-ganea