Hui Chen
I encountered the same problem with DeepSpeed version 0.14.2.
I also encountered this issue. Have you found any solutions?
Have you tried adding ```predictions = np.argmax(predictions, axis=-1)``` before decoding? The current predictions appear to have shape (batch_size, length, vocabulary_size), but decoding needs token IDs with shape (batch_size, length).
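
In case it helps, here is a minimal sketch of that step in isolation. The random array is only a stand-in for the real model output, and `logits_to_token_ids` is a hypothetical helper name, not part of any library; I am assuming the predictions arrive as a NumPy array of logits.

```python
import numpy as np


def logits_to_token_ids(predictions: np.ndarray) -> np.ndarray:
    """Collapse (batch_size, length, vocabulary_size) logits to (batch_size, length) token IDs.

    Hypothetical helper for illustration; arrays that are already 2-D are returned unchanged.
    """
    if predictions.ndim == 3:
        # Pick the highest-scoring vocabulary index at every position.
        return np.argmax(predictions, axis=-1)
    return predictions


# Stand-in for real model output (assumed shapes, random values).
batch_size, length, vocabulary_size = 2, 5, 11
rng = np.random.default_rng(0)
predictions = rng.normal(size=(batch_size, length, vocabulary_size))

token_ids = logits_to_token_ids(predictions)
print(predictions.shape, "->", token_ids.shape)
```

The resulting integer array is what a tokenizer's decode step would normally expect, so the call to decode should come after this conversion.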
Hi @JiuhaiChen, I also encountered this issue. Did you solve it?