Yen-Chun Chen

Results: 7 comments by Yen-Chun Chen

This might be a bug. I did not test 2 layers since 1 layer already gives good results. I will investigate this ASAP.

Thanks for pointing this out. I will investigate.

Even if there's only one sentence, [batch_size](https://github.com/ChenRocks/fast_abs_rl/blob/aebf539107caba5be35720f5d1f9f98989a069e8/data/batcher.py#L115) would be 1, and the function handles this properly. Is your data in the correct format?

Thanks for the explanation. I understand the issue now. If you are working on a dataset that only has 1-sentence summaries as output, this repository might not be the best choice...

Thanks for pointing this out! I think your solution should work as intended. I will test how this affects the results when I have time.

Currently there is no such function. Training should finish in a reasonable time on a GPU, and I currently have no plans to support this. Feel free to contribute and...

You should not see the apex loss scaler reducing the loss scale to less than 1.

```
[1,0]:Gradient overflow. Skipping step, loss scaler 5 reducing loss scale to 4.3601508761683463e-106
```
...
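For context, here is a minimal sketch of how a dynamic loss scaler typically behaves. This is not apex's implementation: the class `DynamicLossScaler` and its parameters (`init_scale`, `factor`, `window`, `min_scale`) are hypothetical names chosen for illustration. The idea is that the scale is cut on every overflow, so a value like `4.36e-106` suggests that nearly every step overflowed, which may mean the gradients are non-finite regardless of the scale.

```python
import math


class DynamicLossScaler:
    """Simplified, hypothetical dynamic loss scaler (not apex's actual code)."""

    def __init__(self, init_scale=2.0 ** 16, factor=2.0, window=2000, min_scale=1.0):
        self.scale = init_scale
        self.factor = factor        # shrink/grow multiplier
        self.window = window        # clean steps required before growing the scale
        self.min_scale = min_scale  # assumed floor; without it the scale can decay toward 0
        self._good_steps = 0

    def update(self, grads):
        """Adjust the scale from this step's gradients.

        Returns True if the optimizer step should be applied,
        False if it should be skipped because of an overflow.
        """
        overflow = any(not math.isfinite(g) for g in grads)
        if overflow:
            # Shrink the scale, but never below the floor.
            self.scale = max(self.scale / self.factor, self.min_scale)
            self._good_steps = 0
            return False
        self._good_steps += 1
        if self._good_steps >= self.window:
            # Enough clean steps in a row: try a larger scale again.
            self.scale *= self.factor
            self._good_steps = 0
        return True
```

With a floor in place, repeated overflows stop at `min_scale` instead of shrinking indefinitely. If the scale still hits the floor on every step, the overflow is probably not a scaling problem, and the loss or the gradients themselves are worth checking for NaN or inf.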