
Possible bug when calculating running loss

Open HenryDashwood opened this issue 5 years ago • 4 comments

Hi there!

I've been looking at your training loop in the text classification tutorial, and I see that you accumulate a total running loss and then divide it after the loop ends to get the average loss.

Looking at this line, though, I'm a little confused: `running_loss += loss.data[0] * x.size(0)`. When I print out the shape of `x` I get these values [screenshot of the printed shape of `x`]. The batch size appears to be at index 1, so should the running loss line read `running_loss += loss.data[0] * x.size(1)`?

It's entirely possible that I've made the mistake here, but I thought I would ask anyway!

HenryDashwood avatar May 08 '19 17:05 HenryDashwood
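
For reference, a minimal sketch of the shapes in question, assuming the tutorial's default torchtext setup where `Field(batch_first=False)` yields sequence-first batches; the concrete sizes below are made up for illustration:

```python
import torch

# Hypothetical sizes standing in for the shapes shown in the screenshot above.
seq_len, batch_size, num_labels = 100, 64, 6
x = torch.randint(0, 5000, (seq_len, batch_size))  # batch.comment_text: (seq_len, batch_size)
y = torch.rand(batch_size, num_labels)             # labels: (batch_size, num_labels)

print(x.size(0))  # 100 -> padded sequence length
print(x.size(1))  # 64  -> batch size, the dimension the question is about
```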

Hello, I have the same issue. Have you ever solved it? Is `running_loss += loss.data[0] * x.size(1)` the right fix?

rxc205 avatar May 29 '19 14:05 rxc205

`running_loss += loss.item() * x.size(0)` works!!!

rxc205 avatar Jun 03 '19 12:06 rxc205

I ended up using `.size(1)`. It seems to work.

HenryDashwood avatar Jun 03 '19 14:06 HenryDashwood
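
Whether the batch dimension is at index 0 or index 1 depends on how the `Field` was built: the default puts the sequence dimension first. A small sketch, using the `torchtext.data.Field` API that was current when this thread was written (it later moved to `torchtext.legacy.data`):

```python
from torchtext.data import Field  # torchtext <= 0.8

# Default batch_first=False: batches come out as (seq_len, batch_size),
# so x.size(1) is the batch size.
TEXT = Field(sequential=True, lower=True)

# With batch_first=True: batches come out as (batch_size, seq_len),
# so x.size(0) would be the batch size instead.
TEXT_BF = Field(sequential=True, lower=True, batch_first=True)
```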

`x.size(0)` is the length of `comment_text` for the current batch, which torchtext has automatically padded in advance. The meaning of multiplying the scalar loss by `x.size(0)` is not clear. Maybe the author considered it something like a batch weight? But it still makes no sense.

gitfourteen avatar Aug 21 '19 12:08 gitfourteen
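
For what it's worth, the usual motivation for this pattern: when the criterion returns the mean loss over a batch, multiplying by the number of examples in that batch and dividing by the dataset size at the end gives a per-example average that is not skewed by a smaller final batch. A minimal sketch, assuming sequence-first batches as discussed above (so the batch size is `x.size(1)`); `train_iter` and `n_examples` are hypothetical stand-ins for the tutorial's iterator and dataset size:

```python
def train_epoch(model, criterion, optimizer, train_iter, n_examples):
    """Run one epoch and return the per-example average training loss."""
    model.train()
    running_loss = 0.0
    for x, y in train_iter:
        optimizer.zero_grad()
        loss = criterion(model(x), y)   # criterion assumed to return the batch *mean*
        loss.backward()
        optimizer.step()
        # Weight the batch-mean loss by the number of examples in the batch
        # (x.size(1) for sequence-first batches), not by the padded length.
        running_loss += loss.item() * x.size(1)
    return running_loss / n_examples
```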