mwer
Have you reproduced the results reported in the paper?
Thanks for sharing your implementation of MWER. Have you reproduced the results reported in the paper by Rohit Prabhavalkar et al., "Minimum Word Error Rate Training for Attention-Based Sequence-to-Sequence Models"?
My baseline model is very different from the one in the paper. However, an overfitting experiment shows that the loss function works well.
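
For readers who want to run a similar sanity check, here is a minimal sketch of the N-best form of the MWER loss as I understand it from the paper: hypothesis probabilities are renormalized over the N-best list, and each hypothesis is weighted by its word-error count minus the list average. This is not the code in this repository. The names `word_errors` and `mwer_loss` are hypothetical, and the sketch uses plain Python rather than an autodiff framework, so it only illustrates the value of the loss, not its gradient.

```python
import math


def word_errors(hyp, ref):
    """Word-level edit distance between two token lists."""
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, start=1):
        cur = [i]
        for j, r in enumerate(ref, start=1):
            cur.append(min(
                prev[j] + 1,             # drop the hypothesis word h
                cur[j - 1] + 1,          # add the reference word r
                prev[j - 1] + (h != r),  # substitute, or match at no cost
            ))
        prev = cur
    return prev[-1]


def mwer_loss(log_probs, hyps, ref):
    """Expected relative word errors over an N-best list (illustrative only).

    log_probs: model log-probability of each hypothesis (assumed given).
    hyps:      list of hypotheses, each a list of words.
    ref:       reference transcript as a list of words.
    """
    # Renormalize the probabilities over the N-best list (stable softmax).
    m = max(log_probs)
    exps = [math.exp(lp - m) for lp in log_probs]
    z = sum(exps)
    probs = [e / z for e in exps]

    # Subtract the average error count over the list as a baseline.
    errs = [word_errors(h, ref) for h in hyps]
    mean_err = sum(errs) / len(errs)
    return sum(p * (e - mean_err) for p, e in zip(probs, errs))


ref = "the cat sat".split()
hyps = ["the cat sat".split(), "the cat sad".split()]
good = mwer_loss([0.0, -5.0], hyps, ref)  # mass on the correct hypothesis
bad = mwer_loss([-5.0, 0.0], hyps, ref)   # mass on the wrong hypothesis
print(good < bad)
```

In an overfitting check, the loss should fall as the model moves probability mass toward the hypothesis with the fewest word errors, which is the behavior the comparison at the end illustrates.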