Results 13 comments of gaebor

@aKzenT is right about the binary/text file handling thing. https://msdn.microsoft.com/en-us/library/h9t88zwz.aspx I'm surprised that I haven't run into it. I should correct it.

@aKzenT I see now, the problem is reading and writing binary data via the console. I think `-input` and `-output` command-line arguments would help the `shuffle` tool. The...
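
To illustrate the idea, here is a minimal Python sketch of `-input` and `-output` arguments that open files in binary mode, so no newline translation happens on the way in or out. This is not the `shuffle` tool's code: the function names and the plain byte copy are assumptions made for illustration, and only the two flag names come from the comment above.

```python
import argparse
import sys


def parse_args(argv=None):
    # The flag names follow the suggestion above; everything else is hypothetical.
    parser = argparse.ArgumentParser(description="binary-safe I/O sketch")
    parser.add_argument("-input", default=None, help="input file (default: stdin)")
    parser.add_argument("-output", default=None, help="output file (default: stdout)")
    return parser.parse_args(argv)


def copy_bytes(src, dst, chunk_size=1 << 16):
    # Copy raw bytes in chunks; nothing is decoded or translated.
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        dst.write(chunk)


def main(argv=None):
    args = parse_args(argv)
    # "rb"/"wb" bypass text-mode newline translation. When no file is given,
    # fall back to the underlying byte streams of stdin/stdout.
    src = open(args.input, "rb") if args.input else sys.stdin.buffer
    dst = open(args.output, "wb") if args.output else sys.stdout.buffer
    try:
        copy_bytes(src, dst)
    finally:
        # Close only the files opened here, never the console streams.
        if args.input:
            src.close()
        if args.output:
            dst.close()
```

Calling `main(["-input", "in.bin", "-output", "out.bin"])` (both file names hypothetical) copies the file byte for byte, including any `\r\n` sequences.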

Yes, just as mentioned above. I'll try to make the modifications for the shuffle but it hasn't been a priority for me.

I had the same problem with the provided model: http://download.tensorflow.org/models/nmt/10122017/ende_gnmt_model_8_layer.zip I used the script: https://github.com/tensorflow/nmt/blob/master/nmt/scripts/wmt16_en_de.sh The resulting vocabulary had 36549 elements while the pre-trained model has 36548! Assign requires shapes...
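
A quick way to catch this kind of off-by-one before loading a checkpoint is to count the vocabulary entries and compare them with the size the model expects. The sketch below assumes the vocabulary file holds one token per line; the function names and the file path in the usage note are hypothetical.

```python
def count_vocab_entries(path):
    # Assumes one token per line. Reading in binary mode avoids decoding issues.
    with open(path, "rb") as f:
        return sum(1 for _ in f)


def check_vocab_size(path, expected_size):
    # Raise early if the generated vocabulary does not match the expected size.
    actual = count_vocab_entries(path)
    if actual != expected_size:
        raise ValueError(
            f"vocabulary has {actual} entries, expected {expected_size}"
        )
    return actual
```

For the numbers above, `check_vocab_size("vocab.txt", 36548)` would raise on a file with 36549 lines.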

FYI I rolled back `mosesdecoder` to commit 5b9a6da9a4065b776d1dffedbd847be565c436ef and `subword-nmt` to 3d28265d779e9c6cbb39b41ba54b2054aa435005. The resulting vocabulary was the right size, so at least the checkpoint worked but the test BLEU...

Thanks @christ1ne, but that has the same problem as the one I tried: without the vocabulary the models are useless.

I'd be damned if I haven't tried it, see: https://github.com/tensorflow/nmt/issues/415#issuecomment-482791575

I beg to differ: https://github.com/mlperf/training/blob/master/rnn_translator/download_dataset.sh#L147 https://github.com/tensorflow/nmt/blob/master/nmt/scripts/wmt16_en_de.sh#L129 But I guess it can't hurt to try.