Atul Kumar

Results: 21 comments of Atul Kumar

Thanks for reviewing the code. I have fixed the bug. https://github.com/atulkum/pointer_summarizer/blob/master/training_ptr_gen/train.py#L91 https://github.com/atulkum/pointer_summarizer/blob/master/training_ptr_gen/train.py#L100

I have turned on is_coverage=True after training for 500k iterations. Turning is_coverage=True on from the beginning makes the training unstable.
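For reference, a minimal sketch of that two-phase schedule; the names (`config.is_coverage`, `Train.trainIters`, the checkpoint path) are assumptions about the repo layout, not the exact API — check train.py for the real entry points:

```python
# Sketch only: module/function names below are assumed, not verified against the repo.
from data_util import config                 # repo's config module (assumed)
from training_ptr_gen.train import Train     # Train class in train.py (assumed)

# Phase 1: train without the coverage loss for ~500k iterations
config.is_coverage = False
Train().trainIters(500000, model_file_path=None)

# Phase 2: reload the phase-1 checkpoint and continue with coverage switched on
config.is_coverage = True
Train().trainIters(100000, model_file_path='path/to/phase1_checkpoint')
```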

You are right that it adds branches to the computation graph, but that won't cause NaN. If you are getting NaN, it is probably coming from somewhere else. I tested it on pytorch 0.4...

After how many iterations (with is_coverage = True) are you getting NaN? Did you initialize model_file_path in the code? https://github.com/atulkum/pointer_summarizer/blob/master/training_ptr_gen/train.py#L141 You can try to debug it on the CPU. My...
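If it helps, here is a minimal sketch of the kind of check I'd use to localize the NaN; the variable names (`loss`, `step_coverage_loss`, `iter_step`) are placeholders, not the exact ones in train.py:

```python
import torch

def check_finite(name, tensor, step):
    """Fail fast when a tensor turns NaN/Inf so the offending iteration is easy to find."""
    if torch.isnan(tensor).any() or torch.isinf(tensor).any():
        raise RuntimeError('%s became non-finite at iteration %d' % (name, step))

# Inside the training loop, right after the losses are computed:
#   check_finite('total loss', loss, iter_step)
#   check_finite('coverage loss', step_coverage_loss, iter_step)
# Running on the CPU makes the resulting stack trace and tensor values easier to inspect.
```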

I have uploaded a model [here](https://drive.google.com/open?id=1luUphx8Glc7uSPhKZuvvF8PiH0XR6EdC). I retrained it with is_coverage = True for 200k iterations and did not get NaN. For retraining you should do 3 things: 1)...

```>& log/training_log``` simply redirects the output to the file ```log/training_log```; the ```&``` at the end is for running the program in the background. You might have a ```training_log``` directory created in ```log```...

2GB is too low. You can do 2 things: 1) use a pre-trained embedding, extract the embedding vectors on the CPU, and don't load the embedding into the GPU (a rough sketch is below); 2) use a smaller number of encoding...
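As a rough illustration of option 1), here is a sketch (hypothetical class, not code from this repo) that keeps the embedding table on the CPU and only moves the looked-up vectors for the current batch to the GPU:

```python
import torch
import torch.nn as nn

class CPUEmbedding(nn.Module):
    """Keep the large embedding table on the CPU; ship only per-batch vectors to the GPU."""
    def __init__(self, vocab_size, emb_dim, device):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, emb_dim)  # stays on the CPU
        self.device = device

    def forward(self, token_ids):
        vectors = self.embedding(token_ids.cpu())   # lookup on the CPU
        return vectors.to(self.device)              # transfer only this batch's vectors

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
embed = CPUEmbedding(vocab_size=50000, emb_dim=128, device=device)
batch = torch.randint(0, 50000, (8, 400))           # fake batch of token ids
enc_inputs = embed(batch)                            # (8, 400, 128) on the GPU
```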

Yes, it is not mentioned anywhere in the paper, but the code has it. https://github.com/abisee/pointer-generator/blob/master/attention_decoder.py#L150

I found the paper where a similar kind of attention mechanism is used: [Order Matters: Sequence to sequence for sets](https://arxiv.org/abs/1511.06391)

Thanks for pointing this out. You are right. I have updated my code, but I still need to re-run the experiments. I will update the results after that. Here is...