Logits vs Log-softmax scores in LearnedMixin implementation
Hi,
I had a question regarding the PyTorch implementation of LearnedMixin. https://github.com/chrisc36/debias/blob/af7f0e40f9120ae2d3081cb8a2bf4dad64a18aa7/debias/bert/clf_debias_loss_functions.py#L41
def forward(self, hidden, logits, bias, labels):
    logits = logits.float()  # In case we were in fp16 mode
    logits = F.log_softmax(logits, 1)
    factor = self.bias_lin.forward(hidden)
    factor = factor.float()
    factor = F.softplus(factor)
    bias = bias * factor
    bias_lp = F.log_softmax(bias, 1)
    entropy = -(torch.exp(bias_lp) * bias_lp).sum(1).mean(0)
    loss = F.cross_entropy(logits + bias, labels) + self.penalty * entropy
    return loss
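For reference, PyTorch's F.cross_entropy already combines log_softmax and nll_loss internally, which is part of why the extra log_softmax surprised me. A quick check with made-up tensors (nothing from the repo):

import torch
import torch.nn.functional as F

torch.manual_seed(0)
scores = torch.randn(4, 3)            # arbitrary example scores
labels = torch.tensor([0, 2, 1, 1])

# cross_entropy is equivalent to nll_loss on log-softmaxed scores
print(torch.allclose(F.cross_entropy(scores, labels),
                     F.nll_loss(F.log_softmax(scores, 1), labels)))  # prints True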
The forward function adds the logits and bias variables; however, logits has been log-softmaxed while bias has not (bias appears to be the raw logits from the bias-only model). Should we really be applying log-softmax to logits before passing the sum into the cross_entropy loss? Could you explain the reasoning behind this?
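To make the comparison concrete, here is a minimal sketch with made-up tensors (again, nothing from the repo). Since log_softmax only subtracts a per-row constant (the logsumexp) and cross_entropy is invariant to per-row constant shifts of its input, the cross_entropy term itself seems to come out the same with or without the extra log_softmax on logits:

import torch
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(4, 3)            # stand-in for the main model's logits
bias = torch.randn(4, 3)              # stand-in for the (scaled) bias scores
labels = torch.tensor([0, 2, 1, 1])

# As in the forward above: log-softmax the logits, then add the bias
loss_as_implemented = F.cross_entropy(F.log_softmax(logits, 1) + bias, labels)

# Without the extra log_softmax on logits
loss_without = F.cross_entropy(logits + bias, labels)

# log_softmax only shifts each row by a constant, and cross_entropy ignores
# per-row constant shifts, so the two losses match here
print(torch.allclose(loss_as_implemented, loss_without))  # prints True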
Following up, as I have the same question. :) Thanks!