mixup-generator Difference with the original paper

Open • melaanya opened this issue 5 years ago • 1 comment

Hi @yu4u! Thank you for your work!

After studying the repo, I still have one question about label processing. In the original implementation, the labels are mixed at loss-computation time:

def mixup_criterion(criterion, pred, y_a, y_b, lam):
    return lam * criterion(pred, y_a) + (1 - lam) * criterion(pred, y_b)

In your implementation, you're mixing up the labels:

y1 = self.y_train[batch_ids[:self.batch_size]]
y2 = self.y_train[batch_ids[self.batch_size:]]
y = y1 * y_l + y2 * (1 - y_l)

After substituting the resulting labels into the loss, even for binary_cross_entropy, the resulting equation isn't the same. So my question is: what was the motivation for moving the label mixup out of the loss computation and into the generator?

Jun 28 '19 10:06 melaanya

Mixing the labels directly is the formulation used in the original paper. Please refer to it:

https://arxiv.org/pdf/1710.09412.pdf
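
For context, the paper defines the mixed target as ỹ = λ·y_i + (1 − λ)·y_j. Cross-entropy is linear in the target, so for that loss, mixing the labels and mixing the two losses should give the same value. The following is a minimal pure-Python sketch of that check, not code from either repository: `cross_entropy`, `mix_labels`, and the sample values for `pred`, `y_a`, `y_b`, and `lam` are assumptions made for illustration, and a single scalar `lam` stands in for the per-sample `y_l`.

```python
import math


def cross_entropy(pred, target):
    """Cross-entropy between a predicted distribution and a (possibly soft) target."""
    return -sum(t * math.log(p) for p, t in zip(pred, target))


def mixup_criterion(criterion, pred, y_a, y_b, lam):
    # Loss-side mixing, as in the snippet quoted in the question.
    return lam * criterion(pred, y_a) + (1 - lam) * criterion(pred, y_b)


def mix_labels(y_a, y_b, lam):
    # Label-side mixing, mirroring y = y1 * y_l + y2 * (1 - y_l) for one sample.
    return [lam * a + (1 - lam) * b for a, b in zip(y_a, y_b)]


# Hypothetical values chosen only for this illustration.
pred = [0.7, 0.2, 0.1]   # a softmax-style output over three classes
y_a = [1.0, 0.0, 0.0]    # one-hot label of the first sample
y_b = [0.0, 0.0, 1.0]    # one-hot label of the second sample
lam = 0.3                # mixing coefficient

loss_side = mixup_criterion(cross_entropy, pred, y_a, y_b, lam)
label_side = cross_entropy(pred, mix_labels(y_a, y_b, lam))

# The two values should agree up to floating-point rounding.
print(loss_side, label_side, math.isclose(loss_side, label_side))
```

The same argument applies to binary cross-entropy, which is also linear in the target. It does not carry over to losses that are non-linear in the target, such as mean squared error, where the two approaches would generally differ.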

Jul 02 '19 09:07 yu4u
