Deep-Reinforcement-Learning-for-Dialogue-Generation-in-tensorflow: Have you ever run into the reward exploding?

Open • Tangzy7 opened this issue 6 years ago • 1 comment

I used only Ease of answering as the reward, but as training progresses this term starts at around -2.x and keeps decreasing toward negative infinity.

Jun 16 '18 09:06 Tangzy7

@Tangzy7 Hello, did you manage to solve the `expected int32 got list containing tensors of type '_message' instead` error? I would appreciate any advice. Thank you.

Jun 21 '18 08:06 Sunnee2018