Orig1n

2 issues by Orig1n

Should the code "new_state = (new_state, new_attns, new_attn_states)" be written as "new_state = (new_state, output, new_attn_states)" instead?

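1. The dimension being computed over is wrong. 2. Pooling needs to use the actual sequence lengths; it cannot simply reduce over the entire time dimension. I suggest introducing a mask.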
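
To illustrate the mask suggestion in point 2, here is a minimal NumPy sketch of mean pooling that only averages over the valid time steps of each sequence. The function name `masked_mean_pool`, the `(batch, max_time, dim)` layout, and the `lengths` argument are assumptions for illustration; they are not taken from the repository's code.

```python
import numpy as np

def masked_mean_pool(inputs, lengths):
    """Mean-pool over the time axis, ignoring padded steps.

    inputs:  array of shape (batch, max_time, dim)   (assumed layout)
    lengths: array of shape (batch,), actual length of each sequence
    """
    inputs = np.asarray(inputs, dtype=np.float64)
    lengths = np.asarray(lengths)
    max_time = inputs.shape[1]
    # mask[b, t] is 1.0 where t < lengths[b], else 0.0
    mask = (np.arange(max_time)[None, :] < lengths[:, None]).astype(inputs.dtype)
    # Zero out padded steps, then sum over time (axis 1)
    summed = (inputs * mask[:, :, None]).sum(axis=1)
    # Divide by the real length; clamp to 1 to avoid division by zero
    denom = np.maximum(lengths, 1)[:, None]
    return summed / denom

batch = np.array([
    [[1.0, 2.0], [3.0, 4.0], [0.0, 0.0]],  # length 2, last step is padding
    [[2.0, 2.0], [4.0, 4.0], [6.0, 6.0]],  # length 3, no padding
])
pooled = masked_mean_pool(batch, [2, 3])
naive = batch.mean(axis=1)  # reduces over padding too
```

For the first sequence the masked version averages only its two real steps, while the naive `mean(axis=1)` also divides by the padded step and so underestimates the result. The same idea should carry over to a max pooling variant by setting padded positions to a very negative value before reducing.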