Abi See
@Spandan-Madan looking at the smoothed loss curve in TensorBoard, the training loss was about 2.8 after roughly 230k iterations, before we turned on coverage.
@Spandan-Madan "coverage" is one of the main ideas of the paper. See also [these flags](https://github.com/abisee/pointer-generator/blob/master/run_summarization.py#L63).
You can't (easily) increase those things. You've been training a model that performs matrix transformations on hidden vectors of size 64. You can't use that model to handle hidden...
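To make the shape constraint concrete, here is a minimal numpy sketch (illustrative only, not the repo's code): the weight matrices learned at training time have shapes fixed by the hidden size, so a model trained with hidden vectors of size 64 simply cannot be applied to a larger hidden state.

```python
import numpy as np

# Illustrative sketch: an RNN cell's weight matrix shape is baked in by hidden_dim.
emb_dim, hidden_dim = 128, 64
W = np.random.randn(emb_dim + hidden_dim, hidden_dim)   # shape (192, 64), fixed at training time

def rnn_step(x, h, W):
    # one vanilla RNN step; the matmul only works if h has exactly hidden_dim entries
    return np.tanh(np.concatenate([x, h]) @ W)

x = np.random.randn(emb_dim)
h = rnn_step(x, np.zeros(hidden_dim), W)   # OK: size 64 matches W

h_big = np.zeros(256)                      # a 256-dim state cannot be fed through the 64-dim W
# rnn_step(x, h_big, W)                    # -> ValueError: shapes are not aligned
```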
Not sure I understand the question. `max_dec_steps` refers to the maximum number of steps we will run the decoder RNN. This is the same thing as "max number of abstract...
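In other words, `max_dec_steps` is just a hard cap on how many decoder steps are run, something like this rough sketch (the helper names here are hypothetical, not the repo's code):

```python
# Hypothetical sketch of what max_dec_steps controls: the decoder RNN is run
# for at most this many steps, so no output can be longer than max_dec_steps tokens.
max_dec_steps = 100
STOP_TOKEN = "[STOP]"

def decode(decode_one_step, start_token="[START]"):
    tokens = [start_token]
    for _ in range(max_dec_steps):          # hard cap on abstract length
        next_tok = decode_one_step(tokens)  # run one decoder RNN step
        if next_tok == STOP_TOKEN:          # model may stop earlier on its own
            break
        tokens.append(next_tok)
    return tokens[1:]
```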
You're right, the comments were wrong. Now [fixed](https://github.com/abisee/pointer-generator/commit/0cdcaeeaf8f42d4d64ec2ed09eb2f0158cd0db8f).
Yes, the coverage vector **c** increases monotonically and is unbounded. It is possible for **sum(min(a,c))** to be 1 forever if the attention **a** always attends to something that's already been...
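Here is a small numpy sketch of that behavior (names and shapes are illustrative, not the repo's code): the coverage vector is the running sum of past attention distributions, so it grows without bound, while the per-step coverage penalty **sum(min(a,c))** can never exceed 1 because **a** is a probability distribution.

```python
import numpy as np

enc_len = 5
c = np.zeros(enc_len)                      # coverage starts at zero
attn_history = [np.array([0.90, 0.05, 0.03, 0.01, 0.01]),
                np.array([0.85, 0.10, 0.03, 0.01, 0.01])]  # re-attends to token 0

for a in attn_history:
    covloss = np.sum(np.minimum(a, c))     # penalizes attending to already-covered tokens
    print("coverage loss:", covloss)       # 0.0 at step 1, ~0.95 at step 2
    c += a                                 # coverage accumulates monotonically, unbounded
```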
@Gandor26 For intuition see the [blog post](http://www.abigailsee.com/2017/04/16/taming-rnns-for-better-summarization.html).
That padding code is for the decoder inputs and targets during _training_, not test-time decoding. During decoding, the decoder is run one step at a time with beam search, and...
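A rough sketch of the distinction (illustrative, not the repo's exact code): training batches need fixed-length decoder inputs/targets, so they are truncated/padded to `max_dec_steps`, whereas beam-search decoding feeds the decoder one token at a time and never touches that padding.

```python
def pad_decoder_seq(tokens, max_dec_steps, pad_id):
    tokens = tokens[:max_dec_steps]                            # truncate long targets
    return tokens + [pad_id] * (max_dec_steps - len(tokens))   # pad short ones

# training: fixed-length, padded targets so examples can be batched together
batch_targets = [pad_decoder_seq(t, max_dec_steps=100, pad_id=0)
                 for t in [[5, 8, 2], [7, 9, 4, 3, 2]]]

# decoding: no padding -- each beam just appends one token per decoder step
```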
Yes, the pointer-generator model is able to produce UNK tokens during decoding. UNK is part of the vocabulary object and the pointer-generator decoder has access to the whole vocabulary.
1. We very rarely / perhaps never see UNKs in the output of the pointer-generator model. However, this is mostly because at test time, the pointer-generator model acts in pointing...
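To see why UNK is possible but rare, here is a minimal numpy sketch of the final distribution (illustrative shapes and names, not the repo's code): `P_final = p_gen * P_vocab + (1 - p_gen) * P_copy`. UNK is an ordinary vocabulary entry, so it can receive probability through the generation path, but when the model is mostly copying source words the copy mass dominates and UNK almost never wins.

```python
import numpy as np

vocab_size, num_oov = 6, 2           # extended vocab: in-vocab words + source OOVs
UNK_ID = 1

p_gen = 0.2                           # model leans toward copying here
P_vocab = np.array([0.1, 0.3, 0.2, 0.2, 0.1, 0.1])   # includes some mass on UNK_ID
attn = np.array([0.7, 0.2, 0.1])                     # attention over 3 source tokens
src_ids = [4, 6, 7]                                  # source tokens mapped to extended-vocab ids

P_final = np.zeros(vocab_size + num_oov)
P_final[:vocab_size] += p_gen * P_vocab              # generation path (can yield UNK)
for pos, tok_id in enumerate(src_ids):
    P_final[tok_id] += (1 - p_gen) * attn[pos]       # copy path (never yields UNK)

print(P_final.argmax())   # a copied source word, not UNK, gets the most probability
```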