xnmt icon indicating copy to clipboard operation
xnmt copied to clipboard

Auxiliary loss: usage unclear

Open msperber opened this issue 6 years ago • 2 comments

I found that it's no longer enough to implement on_calc_additional_loss() to add an additional loss in my model, but that the loss calculator needs to be changed, as well.

Is the FeedbackLoss the intended loss calculator for this? The code suggests that it is, but I find the name quite confusing. I'm actually not interested in computing a loss based on another loss, just add an independent second loss. This is a very common thing to do so I think we should make it intuitive to do so.

However, I'm wondering if it's intended not to compute additional losses by default (i.e. with the default loss calculator)? If so, that should be documented, but it doesn't seem super intuitive to me.

msperber avatar Aug 25 '18 12:08 msperber

Any hints, maybe @philip30 ? I'll be happy to fix this myself but don't want to undo any previous design decisions and would like to know what the best way is:

  • Is there a reason why MLELoss does not compute auxiliary losses? Should we add it back in there? Or maybe at least as an optional feature?
  • or, rename FeedbackLoss to something more general?
  • or, since FeedbackLoss is a bit special because of the repeat feature, maybe introduce a AuxiliaryLoss and then add a convenience class that is a composite loss with mle loss and auxiliary loss as child classes?

Also, there seems to be a bug in case of more than one auxiliary loss. In this case, both are combined via + in https://github.com/neulab/xnmt/blob/master/xnmt/events.py#L111 but this operation is not defined for FactoredLossExpr.

msperber avatar Aug 28 '18 15:08 msperber

Hey @msperber , @neubig was the one who designed this, but I can answer few questions:

  1. on_calc_additional_loss takes the previous loss calculation as a parameter where the calc_loss takes src and trg. Perhaps the calc_additional_loss should be changed into calc_feedback_loss for clarity.
  2. FeedbackLoss does this by first calculating the loss which is usually mle and feed it back as parameter of the calc_additional_loss. So the name is appropriate.

I would not have time to fix this since we are currently running urgent experiment until the end of the month, so I would strongly recommend delaying fixing this unless it creates major problem in our experiment. In my case, I use the CompositeLoss, combining GlobalFertilityLoss and FeedbackLoss(with MLE child). Do you think that will cause a problem?

philip30 avatar Aug 29 '18 02:08 philip30