
Anyone Change the Loss Function?

Open • landmann opened this issue on May 4, 2018 • 0 comments

I want to change the loss function to penalize the embedding distance instead of the NLL of the actual word. To that end, I've changed this portion of the code:

  if FLAGS.pointer_gen:
    # Calculate the loss per step
    loss_per_step = [] # will be list length max_dec_steps containing shape (batch_size)
    batch_nums = tf.range(0, limit=hps.batch_size) # shape (batch_size)
    for dec_step, dist in enumerate(final_dists):
      targets = self._target_batch[:,dec_step] # The indices of the target words. shape (batch_size)
      indices = tf.stack( (batch_nums, targets), axis=1) # shape (batch_size, 2)
      gold_probs = tf.gather_nd(dist, indices) # shape (batch_size). prob of correct words on this step
      losses = -tf.log(gold_probs)
      loss_per_step.append(losses)

To this:

  if FLAGS.pointer_gen:
    # Calculate the loss per step
    loss_per_step = [] # will be list length max_dec_steps containing shape (batch_size)
    batch_nums = tf.range(0, limit=hps.batch_size) # shape (batch_size)
    for dec_step, dist in enumerate(final_dists):
      targets = self._target_batch[:,dec_step] # The indices of the target words. shape (batch_size)
      target_embeddings = tf.nn.embedding_lookup(embedding, targets)
      pred_indices = tf.argmax(dist, axis=1) # Indices of the predicted words. shape (batch_size)
      pred_embeddings = tf.nn.embedding_lookup(embedding, pred_indices)
      losses = tf.reduce_sum(tf.square(target_embeddings - pred_embeddings), axis=1) # squared L2 distance
      loss_per_step.append(losses)

But I'm getting the following error:

    ([str(v) for _, _, v in converted_grads_and_vars],))
    ValueError: No gradients provided for any variable: ...

Any idea how I can go about this?
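
I suspect the culprit is tf.argmax: it has no gradient, so nothing connects the loss back to the model's variables. Would replacing the hard argmax with the expected embedding under the predicted distribution work? Here's a minimal sketch of what I mean, assuming embedding (shape (vsize, emb_dim)) and vsize are in scope as in model.py, and truncating dist to the fixed vocabulary first, since the final pointer-generator distribution also covers article OOV ids that have no embedding rows:

  if FLAGS.pointer_gen:
    loss_per_step = [] # will be list length max_dec_steps containing shape (batch_size)
    for dec_step, dist in enumerate(final_dists):
      targets = self._target_batch[:,dec_step] # The indices of the target words. shape (batch_size)
      # NOTE: with pointer_gen on, targets can contain extended-vocab ids for
      # copied OOV words; those have no embedding rows and would need masking.
      target_embeddings = tf.nn.embedding_lookup(embedding, targets) # (batch_size, emb_dim)
      # Keep only the fixed-vocabulary part of the distribution and renormalize
      vocab_dist = dist[:, :vsize] # (batch_size, vsize)
      vocab_dist = vocab_dist / tf.reduce_sum(vocab_dist, axis=1, keep_dims=True)
      # Expected embedding under the predicted distribution: a soft,
      # differentiable stand-in for embedding_lookup(argmax(dist))
      pred_embeddings = tf.matmul(vocab_dist, embedding) # (batch_size, emb_dim)
      losses = tf.reduce_sum(tf.square(target_embeddings - pred_embeddings), axis=1) # squared L2 distance
      loss_per_step.append(losses)

Every op in that chain is differentiable, so the optimizer should actually receive gradients, though the extended-vocab target ids would still need handling.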
