Abi See

36 comments by Abi See

@rahul-iisc I've had another look at the code. I see your point about

> OOV part of vocab is max_art_oov long. Not all the sequences in a batch will have...
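For context, a minimal sketch of the extended-vocabulary padding being discussed: the fixed-vocabulary distribution gets `max_art_oovs` extra slots for every example in the batch, whether or not a given article actually has that many OOVs (this loosely mirrors the repo's `_calc_final_dist`; the names and shapes below are illustrative, not the exact code):

```python
import numpy as np

def extend_vocab_dist(vocab_dist, max_art_oovs):
    """Pad the fixed-vocab distribution with max_art_oovs extra slots.

    Every example in the batch gets the same number of extra slots, even if it
    has fewer (or zero) in-article OOVs; unused slots simply keep probability 0
    because no attention weight is ever scattered into them.
    """
    batch_size = vocab_dist.shape[0]
    extra_zeros = np.zeros((batch_size, max_art_oovs), dtype=vocab_dist.dtype)
    return np.concatenate([vocab_dist, extra_zeros], axis=1)  # (batch, vsize + max_art_oovs)
```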

I've looked further into this and still don't understand where the NaNs are coming from. I changed the code to detect when a NaN occurs, then dump the attention distribution,...
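A rough sketch of the kind of NaN check described, assuming the loss and attention distributions are available as NumPy values at each step (illustrative only; the actual debugging code isn't shown in this thread):

```python
import numpy as np

def check_for_nan(step, loss, attn_dists):
    # If the loss goes NaN/inf, dump the attention distributions to disk so
    # they can be inspected offline, then stop training.
    if not np.isfinite(loss):
        np.save("attn_dists_step_%d.npy" % step, np.asarray(attn_dists))
        raise Exception("Loss is NaN/inf at step %d; attention dists dumped" % step)
```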

Hello everyone, and thanks for your patience. We've made a few changes that help with the NaN issue.
* We [changed](https://github.com/abisee/pointer-generator/commit/d08c4c5cc358a0e9bdeebb46e47885cd8cdb2760) the way the log of the final distribution is...
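The gist of that kind of change, sketched below with illustrative variable names (the linked commit may differ in detail): take the log of only the gold token's probability, optionally stabilised with a small epsilon, rather than taking the log of the entire final distribution, so zero-probability entries elsewhere can't produce inf/NaN:

```python
import tensorflow as tf  # TF 1.x

def neg_log_likelihood(final_dist, target_ids, batch_size, epsilon=1e-12):
    # final_dist: (batch_size, extended_vsize); target_ids: (batch_size,) int32
    batch_nums = tf.range(0, limit=batch_size)             # (batch_size,)
    indices = tf.stack([batch_nums, target_ids], axis=1)   # (batch_size, 2)
    gold_probs = tf.gather_nd(final_dist, indices)          # prob of each gold token
    return -tf.log(gold_probs + epsilon)                    # epsilon guards against log(0)
```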

I tend to see this kind of output in the earlier phases of training (i.e. when the model is still under-trained). Look at the loss curve on tensorboard -- has...

@makcbe Yes, the `eval` mode is designed to be run concurrently with `train` mode. The idea is you can see the loss on the validation set plotted alongside the loss...
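Concretely, that usually means launching two processes against the same log directory, along these lines (paths and flag values are placeholders; see the repo README for the exact invocation):

```
python run_summarization.py --mode=train --data_path=/path/to/train_* --vocab_path=/path/to/vocab --log_root=/path/to/log --exp_name=myexperiment
python run_summarization.py --mode=eval --data_path=/path/to/val_* --vocab_path=/path/to/vocab --log_root=/path/to/log --exp_name=myexperiment
```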

Hi @LilyZL
1. Yes, repetition is very common (it is one of the two big things we are aiming to fix as noted in the [ACL paper](https://arxiv.org/abs/1704.04368)). That's what the...
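The fix the paper proposes for repetition is the coverage mechanism; here is a simplified sketch of the coverage loss it adds (loosely following the paper; the repo's `_coverage_loss` handles masking and averaging slightly differently):

```python
import tensorflow as tf  # TF 1.x

def coverage_loss(attn_dists, padding_mask):
    # attn_dists: list over decoder steps of (batch_size, attn_len) tensors.
    # padding_mask: (batch_size, num_decoder_steps) float mask for real tokens.
    # Penalise re-attending to source positions that already received attention.
    coverage = tf.zeros_like(attn_dists[0])   # running sum of attention so far
    covlosses = []
    for a in attn_dists:
        covlosses.append(tf.reduce_sum(tf.minimum(a, coverage), axis=1))
        coverage += a
    loss_per_step = tf.stack(covlosses, axis=1) * padding_mask  # drop padded steps
    return tf.reduce_mean(tf.reduce_sum(loss_per_step, axis=1))
```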

Hi @fishermanff Yes, running `run_summarization.py` in train mode should restore your last training checkpoint. I think it's handled by the [supervisor](https://github.com/abisee/pointer-generator/blob/master/run_summarization.py#L133).
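For reference, a minimal TF 1.x sketch of how a `tf.train.Supervisor` resumes from the latest checkpoint in its log directory (the path and the toy train op below are just for illustration, not the repo's code):

```python
import tensorflow as tf  # TF 1.x

global_step = tf.train.get_or_create_global_step()
train_op = tf.assign_add(global_step, 1)  # stand-in for a real training step

# If logdir already contains a checkpoint, prepare_or_wait_for_session()
# restores it instead of re-initialising the variables.
sv = tf.train.Supervisor(logdir="log/myexperiment/train", save_model_secs=60)
with sv.prepare_or_wait_for_session() as sess:
    for _ in range(100):
        sess.run(train_op)  # continues from the restored global_step
```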

We do not plan to add that to this repo, but it should be fairly straightforward to copy that functionality from the TextSum code.

Hi @anubhavmax, the same question has been asked [here](https://github.com/abisee/pointer-generator/issues/21). Yes - the pointer-generator model produces mostly extractive summaries. This is discussed in section 7.2 of the [paper](https://arxiv.org/pdf/1704.04368.pdf). It is the...

Yes, RNNs are very slow to train, especially for long sequences (such as in this project), due to the sequential nature of the recurrent connections. I assume by "brute force",...
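A toy illustration of why the time dimension can't be parallelised (not the project's encoder, just the general recurrence):

```python
import numpy as np

def rnn_forward(inputs, h0, W_xh, W_hh, b):
    # Each hidden state depends on the previous one, so the loop over time
    # steps must run strictly sequentially, however long the sequence is.
    h, states = h0, []
    for x_t in inputs:
        h = np.tanh(x_t @ W_xh + h @ W_hh + b)
        states.append(h)
    return states
```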