xnmt
xnmt copied to clipboard
Auxiliary loss: usage unclear
I found that it's no longer enough to implement on_calc_additional_loss()
to add an additional loss in my model, but that the loss calculator needs to be changed, as well.
Is the FeedbackLoss
the intended loss calculator for this? The code suggests that it is, but I find the name quite confusing. I'm actually not interested in computing a loss based on another loss, just add an independent second loss. This is a very common thing to do so I think we should make it intuitive to do so.
However, I'm wondering if it's intended not to compute additional losses by default (i.e. with the default loss calculator)? If so, that should be documented, but it doesn't seem super intuitive to me.
Any hints, maybe @philip30 ? I'll be happy to fix this myself but don't want to undo any previous design decisions and would like to know what the best way is:
- Is there a reason why
MLELoss
does not compute auxiliary losses? Should we add it back in there? Or maybe at least as an optional feature? - or, rename
FeedbackLoss
to something more general? - or, since
FeedbackLoss
is a bit special because of the repeat feature, maybe introduce aAuxiliaryLoss
and then add a convenience class that is a composite loss with mle loss and auxiliary loss as child classes?
Also, there seems to be a bug in case of more than one auxiliary loss. In this case, both are combined via +
in https://github.com/neulab/xnmt/blob/master/xnmt/events.py#L111 but this operation is not defined for FactoredLossExpr
.
Hey @msperber , @neubig was the one who designed this, but I can answer few questions:
- on_calc_additional_loss takes the previous loss calculation as a parameter where the calc_loss takes src and trg. Perhaps the calc_additional_loss should be changed into calc_feedback_loss for clarity.
- FeedbackLoss does this by first calculating the loss which is usually mle and feed it back as parameter of the calc_additional_loss. So the name is appropriate.
I would not have time to fix this since we are currently running urgent experiment until the end of the month, so I would strongly recommend delaying fixing this unless it creates major problem in our experiment. In my case, I use the CompositeLoss, combining GlobalFertilityLoss and FeedbackLoss(with MLE child). Do you think that will cause a problem?