Deep-Reinforcement-Learning-for-Dialogue-Generation-in-tensorflow
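Have you ever run into the reward exploding?
I used only Ease of answering as the reward, but as training goes on, this term starts at about -2.x and keeps decreasing toward negative infinity.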
@Tangzy7 Hello, have you managed to solve the "expected int32 got list containing tensors of type '_message' instead" problem? I would appreciate any advice. Thank you.