mixup-generator
Difference with the original paper
Hi @yu4u! Thank you for your work!
After studying the repo, I still have one question about label processing. In the original implementation, the labels are mixed at loss-computation time:
def mixup_criterion(criterion, pred, y_a, y_b, lam):
    return lam * criterion(pred, y_a) + (1 - lam) * criterion(pred, y_b)
In your implementation, you're mixing the labels themselves when the batch is generated:
y1 = self.y_train[batch_ids[:self.batch_size]]
y2 = self.y_train[batch_ids[self.batch_size:]]
y = y1 * y_l + y2 * (1 - y_l)
After substituting the mixed labels into the loss, even for binary_cross_entropy, the resulting equation isn't the same. So my question is: what was the motivation for moving the label mixup to this step?
This is the formulation used in the original paper, which mixes the labels directly. Please refer to the paper:
https://arxiv.org/pdf/1710.09412.pdf
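For anyone comparing the two formulations, here is a minimal NumPy sketch (not code from this repo or from the original implementation) that checks them numerically. It assumes a single scalar lam for the whole batch and a cross-entropy loss, which is linear in the target; under those assumptions, mixing the labels and mixing the losses give the same value. The helper names cross_entropy and binary_cross_entropy are hypothetical and written only for this illustration.

```python
import numpy as np


def cross_entropy(pred, target):
    # Mean categorical cross-entropy; pred holds probabilities,
    # target holds one-hot or soft (mixed) labels.
    return float(-np.mean(np.sum(target * np.log(pred), axis=1)))


def binary_cross_entropy(pred, target):
    # Mean binary cross-entropy; target may be a soft label in [0, 1].
    return float(-np.mean(target * np.log(pred)
                          + (1 - target) * np.log(1 - pred)))


rng = np.random.default_rng(0)
lam = 0.3  # assumed: one scalar mixing coefficient for the whole batch

# Categorical case: 4 samples, 3 classes, softmax probabilities.
logits = rng.normal(size=(4, 3))
pred = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
y_a = np.eye(3)[rng.integers(0, 3, size=4)]
y_b = np.eye(3)[rng.integers(0, 3, size=4)]

# Mix the labels first, then compute the loss once.
loss_mixed_labels = cross_entropy(pred, lam * y_a + (1 - lam) * y_b)
# Compute the loss twice, then mix the losses.
loss_mixed_losses = (lam * cross_entropy(pred, y_a)
                     + (1 - lam) * cross_entropy(pred, y_b))

# Binary case: 8 samples with probabilities kept away from 0 and 1.
pred_bin = rng.uniform(0.05, 0.95, size=8)
t_a = rng.integers(0, 2, size=8).astype(float)
t_b = rng.integers(0, 2, size=8).astype(float)

bce_mixed_labels = binary_cross_entropy(pred_bin, lam * t_a + (1 - lam) * t_b)
bce_mixed_losses = (lam * binary_cross_entropy(pred_bin, t_a)
                    + (1 - lam) * binary_cross_entropy(pred_bin, t_b))

print(np.isclose(loss_mixed_labels, loss_mixed_losses))
print(np.isclose(bce_mixed_labels, bce_mixed_losses))
```

The equality relies on the loss being linear in the target. For a loss that is not linear in the target (mean squared error, for example), the two formulations generally differ. With a separate lam per sample, the loss-mixing variant would also need per-sample weighting before averaging.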