
triplet_loss

longzeyilang opened this issue · 9 comments

Hi, I think there is a problem with your triplet loss function. Here is your function:

```python
_EPSILON = K.epsilon()

def _loss_tensor(y_true, y_pred):
    y_pred = K.clip(y_pred, _EPSILON, 1.0 - _EPSILON)
    loss = tf.convert_to_tensor(0, dtype=tf.float32)
    g = tf.constant(1.0, shape=[1], dtype=tf.float32)
    for i in range(0, batch_size, 3):
        try:
            q_embedding = y_pred[i + 0]
            p_embedding = y_pred[i + 1]
            n_embedding = y_pred[i + 2]
            D_q_p = K.sqrt(K.sum((q_embedding - p_embedding) ** 2))
            D_q_n = K.sqrt(K.sum((q_embedding - n_embedding) ** 2))
            loss = (loss + g + D_q_p - D_q_n)
        except:
            continue
    loss = loss / (batch_size / 3)
    zero = tf.constant(0.0, shape=[1], dtype=tf.float32)
    return tf.maximum(loss, zero)
```

For an individual triplet, g + D_q_p - D_q_n may be less than 0. Because you sum first and clamp only the final average, those negative terms cancel out the loss from other triplets, which makes the total loss incorrect. The hinge should be applied per triplet, like this:

```python
def triplet_loss(y_true, y_pred):
    total_loss = tf.convert_to_tensor(0, dtype=tf.float32)
    g = tf.constant(1.0, shape=[1], dtype=tf.float32)
    zero = tf.constant(0.0, shape=[1], dtype=tf.float32)
    for i in range(0, batch_size, 3):
        try:
            q_embedding = y_pred[i]
            p_embedding = y_pred[i + 1]
            n_embedding = y_pred[i + 2]
            D_q_p = K.sqrt(K.sum((q_embedding - p_embedding) ** 2))
            D_q_n = K.sqrt(K.sum((q_embedding - n_embedding) ** 2))
            loss = tf.maximum(g + D_q_p - D_q_n, zero)
            total_loss = total_loss + loss
        except:
            continue
    total_loss = total_loss / (batch_size / 3)
    return total_loss
```
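For reference, here is a minimal vectorized sketch of the same per-triplet hinge (the function name and the `margin` argument are illustrative, not from the repo). It assumes the batch is laid out as repeating (query, positive, negative) triplets, just like the loop above:

```python
def triplet_loss_vectorized(y_true, y_pred, margin=1.0):
    # Split the interleaved batch into query / positive / negative embeddings.
    q = y_pred[0::3]
    p = y_pred[1::3]
    n = y_pred[2::3]
    d_q_p = K.sqrt(K.sum(K.square(q - p), axis=-1))
    d_q_n = K.sqrt(K.sum(K.square(q - n), axis=-1))
    # Hinge each triplet individually, then average over the batch.
    return K.mean(K.maximum(margin + d_q_p - d_q_n, 0.0))
```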

longzeyilang avatar Jul 06 '18 04:07 longzeyilang

Hey, Longzeyilang! You are right! Dunno how I missed that. Thank you for pointing that out! I am out of town for a few days, so I won't be able to make changes for now. Could you please raise a PR for this once you test the code?

akarshzingade avatar Jul 06 '18 05:07 akarshzingade

@akarshzingade Thank you! I still have a question about the objective function from the original paper: you seem to have left out the λ||W||² term from the objective function, and I still don't know what W is or where it comes from.

longzeyilang avatar Jul 06 '18 06:07 longzeyilang

@longzeyilang That is the regularisation added to the loss function. 'W' is the weights of the whole network (formally, the parameters of the network viewed as a function).
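Roughly, the full objective then has this form (a sketch of a standard L2-regularised hinge objective, not quoted from the paper; here f_W is the embedding network with parameters W, g is the margin, and D is the Euclidean distance):

```latex
\min_{W}\;\; \frac{1}{N}\sum_{i=1}^{N}
  \max\!\Big(0,\; g + D\big(f_W(q_i), f_W(p_i)\big) - D\big(f_W(q_i), f_W(n_i)\big)\Big)
  \;+\; \lambda \lVert W \rVert_2^2
```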

akarshzingade avatar Jul 08 '18 09:07 akarshzingade

@akarshzingade Does that mean you forgot to add the regularisation to the loss function? Or should I add an extra parameter, like:

```python
first_conv = Conv2D(96, kernel_size=(8, 8), strides=(16, 16), padding='same',
                    kernel_regularizer=TruncatedNormal(stddev=0.01))(first_input)
```

Or how else should I add it?

longzeyilang avatar Jul 09 '18 01:07 longzeyilang

I didn't forget. I was experimenting with how the model works without regularisation back then. It performed pretty decently.

Ideally, you add it in the loss. But in Keras, you would add it to the layers using the kernel_regularizer parameter, like you've mentioned. As far as I remember, the authors use the squared L2 norm for the regularisation. Please check the paper once; I am still travelling and can't check it right now. Keras offers L2 regularisation.
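A minimal sketch of what that would look like (the 0.0001 factor is illustrative, not from the repo; `first_input` is assumed to be the input tensor from the snippet above):

```python
from keras import regularizers
from keras.layers import Conv2D

# Squared-L2 weight decay on this layer's kernel; Keras adds the
# penalty to the training loss automatically.
first_conv = Conv2D(96, kernel_size=(8, 8), strides=(16, 16), padding='same',
                    kernel_regularizer=regularizers.l2(0.0001))(first_input)
```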

akarshzingade avatar Jul 09 '18 04:07 akarshzingade

@akarshzingade Thank you! Have a nice trip

longzeyilang avatar Jul 09 '18 05:07 longzeyilang

@longzeyilang will you be making a pull request with your version of the loss function? I have found that yours produces better results. Thank you both!

cesarandreslopez avatar Jul 18 '18 16:07 cesarandreslopez

@cesarandreslopez please see above

longzeyilang avatar Jul 22 '18 01:07 longzeyilang

Hi, @akarshzingade. So, which one is the correct loss? And is `K.clip(y_pred, _EPSILON, 1.0 - _EPSILON)` necessary?

BlueAnthony avatar Apr 12 '19 09:04 BlueAnthony