tensorflow-triplet-loss
Distance Collapse Or Explosion Prevention
Hi,
I ran some experiments with the batch all strategy on 70 million triplet samples from our own vehicle tracking system.
I found that the loss converges to the margin while the distances collapse or explode. In the worst case, all distances approach 0, the loss converges to the margin, and no further progress can be made.
Such a model cannot be reused even for verification, because a positive distance is not guaranteed to be less than the margin, and the same goes for a negative distance.
I think this is because the loss function alone is not sufficient.
triplet_loss = anchor_positive_dist - anchor_negative_dist + margin
This does not penalize the case where the positive and negative distances collapse or explode together, because the loss depends only on their difference.
So I am experimenting with this version:
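To make the point concrete, here is a minimal pure-Python sketch for a single triplet. The function name, the margin value of 0.5, and the hinge (max with 0) are assumptions chosen for illustration, not taken from the code in this thread.

```python
def triplet_loss(pos_dist, neg_dist, margin):
    """Hinge triplet loss for one triplet (illustrative helper)."""
    return max(pos_dist - neg_dist + margin, 0.0)

margin = 0.5  # arbitrary value for this example

# Collapsed embeddings: both distances are (almost) zero.
collapsed = triplet_loss(1e-6, 1e-6, margin)

# Exploded embeddings: both distances are huge but equal.
exploded = triplet_loss(1e6, 1e6, margin)

# Shifting both distances by the same amount leaves the loss unchanged
# (up to floating-point rounding).
base = triplet_loss(0.3, 0.6, margin)
shifted = triplet_loss(0.3 + 10.0, 0.6 + 10.0, margin)

print(collapsed, exploded, base, shifted)
```

In both the collapsed and the exploded case the loss equals the margin, so the loss value alone cannot tell these failure modes apart.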
if balanced:
    # add two more loss elements to prevent distances from collapsing or exploding
    loss_positive = anchor_positive_dist * margin
    loss_negative = margin / anchor_negative_dist
    triplet_loss += loss_positive + loss_negative
https://github.com/ggsato/tensorflow-triplet-loss/blob/triplet_for_trackings/model/triplet_loss.py#L198
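The effect of the two extra terms can be checked on a single triplet with a small pure-Python sketch. Two things here are assumptions and not part of the snippet above: the `eps` guard against division by zero, and applying the hinge (max with 0) after the extra terms are added. The function name and the margin of 0.5 are also just illustrative.

```python
def balanced_triplet_loss(pos_dist, neg_dist, margin, eps=1e-8):
    """Triplet loss plus the two extra terms, for one triplet.

    eps is an assumption added here to avoid dividing by zero.
    """
    loss = pos_dist - neg_dist + margin
    loss_positive = pos_dist * margin          # grows as the positive distance grows
    loss_negative = margin / (neg_dist + eps)  # grows as the negative distance shrinks
    # Assumed: the hinge is applied after the extra terms are added.
    return max(loss + loss_positive + loss_negative, 0.0)

margin = 0.5  # arbitrary value for this example

collapsed = balanced_triplet_loss(1e-3, 1e-3, margin)  # both distances near zero
exploded = balanced_triplet_loss(1e3, 1e3, margin)     # both distances very large
healthy = balanced_triplet_loss(0.03, 4.0, margin)     # small positive, large negative

print(collapsed, exploded, healthy)
```

Under these assumptions, both the collapsed and the exploded triplet now produce a loss far above the margin, while a triplet with a small positive distance and a large negative distance is clipped to zero.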
This is still experimental, but so far training with the balanced version described above works better. The mean positive distance stays around 0.03, while the mean negative distance is pushed out to 4.0 or more.
Has anyone else faced this issue? If so, what is your workaround?
I finished running search_hyperparams to see the effect of this change.
Here is one result at learning rate = 1e-4. The results in the balanced folder were produced by this improved version.

As you can see, the mean positive distance with the original loss keeps increasing, while with this version it stays lower. The mean negative distance did not differ much.
Regarding my earlier statement that "a positive mean distance keeps stay around 0.03, while a negative mean distance is forced to be apart beyond 4.0 or more":
That calculation was done over all entries, including invalid ones. I have since fixed it to use only valid ones, and the numbers in the picture come from that corrected calculation.
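As a rough illustration of why restricting the mean to valid entries matters, here is a pure-Python sketch. The helper name, the validity rule (distinct indices with the same label), and the toy distance matrix are all assumptions made for this example; they are not the calculation or the data from this thread.

```python
def mean_positive_distance(dist, labels, valid_only=True):
    """Mean anchor-positive distance from a square distance matrix.

    A pair (i, j) is treated as valid when i != j and both share a label
    (an assumed rule for this sketch).
    """
    n = len(labels)
    values = []
    for i in range(n):
        for j in range(n):
            is_valid = i != j and labels[i] == labels[j]
            if is_valid or not valid_only:
                values.append(dist[i][j])
    return sum(values) / len(values) if values else 0.0

# Toy data: samples 0 and 1 share a label, sample 2 has a different one.
labels = [0, 0, 1]
dist = [
    [0.0, 0.1, 4.0],
    [0.1, 0.0, 4.2],
    [4.0, 4.2, 0.0],
]

over_all = mean_positive_distance(dist, labels, valid_only=False)
over_valid = mean_positive_distance(dist, labels, valid_only=True)
print(over_all, over_valid)
```

Averaging over every entry mixes in self-distances and negative pairs, so the result no longer describes the positive pairs at all.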
Hello, I very much agree with your analysis. I have the same problem, but my results are random: sometimes the distance ends up equal to the margin, and every run gives a different result.
Hello, has there been any progress on this issue? Or is there any relevant literature that helps prevent such collapse?