Logits vs Log-softmax scores in LearnedMixin implementation
Hi,
I had a question regarding the PyTorch implementation of LearnedMixin. https://github.com/chrisc36/debias/blob/af7f0e40f9120ae2d3081cb8a2bf4dad64a18aa7/debias/bert/clf_debias_loss_functions.py#L41
def forward(self, hidden, logits, bias, labels):
    logits = logits.float()  # In case we were in fp16 mode
    logits = F.log_softmax(logits, 1)
    factor = self.bias_lin.forward(hidden)
    factor = factor.float()
    factor = F.softplus(factor)
    bias = bias * factor
    bias_lp = F.log_softmax(bias, 1)
    entropy = -(torch.exp(bias_lp) * bias_lp).sum(1).mean(0)
    loss = F.cross_entropy(logits + bias, labels) + self.penalty * entropy
    return loss
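For reference, PyTorch's F.cross_entropy already combines log_softmax and nll_loss internally, which is part of why the extra log_softmax surprised me. A quick check with made-up tensors (nothing from the repo):

import torch
import torch.nn.functional as F

torch.manual_seed(0)
scores = torch.randn(4, 3)            # arbitrary example scores
labels = torch.tensor([0, 2, 1, 1])

# cross_entropy is equivalent to nll_loss on log-softmaxed scores
print(torch.allclose(F.cross_entropy(scores, labels),
                     F.nll_loss(F.log_softmax(scores, 1), labels)))  # prints True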
The forward function adds the logits and bias variables; however, logits has been log-softmaxed while bias has not (bias appears to be the raw logits from the bias-only model). Should we really be applying log-softmax to logits before passing the sum into the cross_entropy loss? Could you explain the reasoning behind this?
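To make the comparison concrete, here is a minimal sketch with made-up tensors (again, nothing from the repo). Since log_softmax only subtracts a per-row constant (the logsumexp) and cross_entropy is invariant to per-row constant shifts of its input, the cross_entropy term itself seems to come out the same with or without the extra log_softmax on logits:

import torch
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(4, 3)            # stand-in for the main model's logits
bias = torch.randn(4, 3)              # stand-in for the (scaled) bias scores
labels = torch.tensor([0, 2, 1, 1])

# As in the forward above: log-softmax the logits, then add the bias
loss_as_implemented = F.cross_entropy(F.log_softmax(logits, 1) + bias, labels)

# Without the extra log_softmax on logits
loss_without = F.cross_entropy(logits + bias, labels)

# log_softmax only shifts each row by a constant, and cross_entropy ignores
# per-row constant shifts, so the two losses match here
print(torch.allclose(loss_as_implemented, loss_without))  # prints True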
Following up, as I have the same question. :) Thanks!