Zewei Chu

3 comments by Zewei Chu

Thanks for doing that! Would you send a pull request to this repo with the added F1 scores in the "Experiments" section? You may add your name somewhere at the end...

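During training, this is mainly to keep the sequence from getting too long: if backprop has to travel too far back, you run out of memory. If you cut the gradient at the hidden state, it will not backprop all the way back.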
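
To make the idea concrete, here is a minimal sketch of truncated backprop through time. It is not the code from this repo. It assumes a toy scalar RNN `h_t = w * h_{t-1} + x_t` with the loss being the sum of all hidden states, and the name `truncated_bptt_grad` and the parameter `chunk_len` are made up for illustration. The hidden value is carried across chunk boundaries, but the gradient is dropped there, so only the states of the current chunk need to be kept in memory.

```python
def truncated_bptt_grad(xs, w, chunk_len):
    """Gradient of sum(h_t) w.r.t. w for the toy RNN h_t = w * h_{t-1} + x_t,
    with the gradient cut at every chunk boundary (hypothetical helper)."""
    h = 0.0            # hidden state carried across chunks (value only)
    total_grad = 0.0
    for start in range(0, len(xs), chunk_len):
        chunk = xs[start:start + chunk_len]

        # Forward pass: store only the states of this chunk.
        # states[0] is the incoming hidden state, treated as a constant.
        states = [h]
        for x in chunk:
            states.append(w * states[-1] + x)

        # Backward pass over this chunk only.
        dh = 0.0
        for t in range(len(chunk), 0, -1):
            dh += 1.0                         # direct term: d(loss)/d(h_t) = 1
            total_grad += dh * states[t - 1]  # local term: d(h_t)/d(w) = h_{t-1}
            dh *= w                           # pass the gradient on to h_{t-1}
        # dh now holds the gradient w.r.t. the incoming hidden state.
        # Discarding it here is the truncation step.

        h = states[-1]
    return total_grad


xs = [1.0, 1.0, 1.0]
full = truncated_bptt_grad(xs, w=0.5, chunk_len=len(xs))  # one chunk: full backprop
cut = truncated_bptt_grad(xs, w=0.5, chunk_len=1)         # gradient cut at every step
print(full, cut)
```

With one chunk covering the whole sequence this is ordinary full backprop. With a smaller `chunk_len` the gradient is only an approximation, but the number of stored states is bounded by `chunk_len + 1` instead of growing with the sequence length. In PyTorch the same cut is usually made by calling `.detach()` on the hidden state before feeding it to the next chunk.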

Hi, the work was performed during my internship at Google, so I don't have access to the baseline models now. However, most models use the Tensor2Tensor library (https://github.com/tensorflow/tensor2tensor), so it should...