training_results_v0.6 NVIDIA v0.6 transformer implementation benchmark data download

Open dagrayvid opened this issue 5 years ago • 0 comments

This is regarding training_results_v0.6/NVIDIA/benchmarks/transformer/implementations/pytorch/

I'm hitting several errors when running bash run_preprocessing.sh && bash run_conversion.sh inside the NGC container. The first is that the URLs in the following lines no longer work:

wget https://raw.githubusercontent.com/tensorflow/models/master/official/transformer/test_data/newstest2014.en -O /workspace/translation/examples/translation/wmt14_en_de/newstest2014.en
wget https://raw.githubusercontent.com/tensorflow/models/master/official/transformer/test_data/newstest2014.de -O /workspace/translation/examples/translation/wmt14_en_de/newstest2014.de

I replaced them with cp newstest2014.* /workspace/translation/examples/translation/wmt14_en_de/, since the files are already in the pytorch directory.
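For anyone who prefers to script the copy workaround, here is a minimal Python sketch. It assumes newstest2014.en and newstest2014.de are present in the source directory; the function name copy_test_sets and its parameters are hypothetical and not part of the benchmark scripts.

```python
import glob
import os
import shutil


def copy_test_sets(src_dir, dest_dir, pattern="newstest2014.*"):
    """Copy files matching `pattern` from src_dir into dest_dir.

    Hypothetical helper: returns the sorted list of copied file names
    and raises FileNotFoundError if nothing matched.
    """
    os.makedirs(dest_dir, exist_ok=True)
    copied = []
    for path in sorted(glob.glob(os.path.join(src_dir, pattern))):
        if os.path.isfile(path):
            shutil.copy(path, dest_dir)
            copied.append(os.path.basename(path))
    if not copied:
        raise FileNotFoundError(f"no files matching {pattern!r} in {src_dir!r}")
    return copied
```

Run from the pytorch directory, this would be called as copy_test_sets(".", "/workspace/translation/examples/translation/wmt14_en_de").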

The next error is an import error raised by the line: from mlperf_log_utils import mlperf_print, mlperf_submission_log, set_seeds, get_rank.

set_seeds is not defined in mlperf_log_utils.py. I simply removed the import of set_seeds as it is not used in preprocess.py anyway. I did the same thing in preprocess_fairseq.py.
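The manual edit described above could also be scripted. The following is only a sketch: drop_imported_name is a hypothetical helper, and it assumes a single-line from ... import statement with no parentheses and no "as" aliases.

```python
def drop_imported_name(line, name):
    """Remove `name` from a single-line `from X import a, b, c` statement.

    Hypothetical helper. Lines that are not from-imports, or that do not
    import `name`, are returned unchanged.
    """
    body = line.rstrip("\n")
    newline = line[len(body):]  # preserve a trailing newline, if any
    head, sep, tail = body.partition(" import ")
    if not sep or not head.lstrip().startswith("from "):
        return line
    names = [n.strip() for n in tail.split(",") if n.strip()]
    kept = [n for n in names if n != name]
    if len(kept) == len(names):
        return line  # `name` was not imported on this line
    if not kept:
        return ""  # nothing left to import, so drop the whole line
    return head + sep + ", ".join(kept) + newline
```

Applied to the failing line with name="set_seeds", this leaves the other three imports in place.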

After that, one more error remained, from the line

mlperf_log.ROOT_DIR_TRANSFORMER = os.path.dirname(os.path.realpath(__file__))

NameError: name 'mlperf_log' is not defined

Importing mlperf_log_utils and replacing mlperf_log with mlperf_log_utils allows the preprocessing to run. After the scripts ran, I was able to run the benchmark with DATADIR set to examples/translation/wmt14_en_de/utf8/. However, needing these hacks to get it running makes me think I must be doing something wrong.
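To illustrate why the NameError appears and why the workaround silences it, here is a small self-contained sketch. It uses a stand-in module registered under the name mlperf_log_utils (an assumption, purely for illustration, not the real module) and os.getcwd() in place of the __file__-based path. A from ... import statement binds only the listed names, so neither mlperf_log nor mlperf_log_utils exists as a name until it is imported directly.

```python
import os
import sys
import types

# Stand-in for the real mlperf_log_utils (assumption: illustration only).
# In the actual container this would shadow the real module, so do not
# run this inside the benchmark scripts.
stub = types.ModuleType("mlperf_log_utils")
stub.get_rank = lambda: 0
sys.modules["mlperf_log_utils"] = stub

from mlperf_log_utils import get_rank  # binds get_rank only, not a module name

try:
    mlperf_log.ROOT_DIR_TRANSFORMER = os.getcwd()
    raised = False
except NameError:
    raised = True  # mlperf_log was never imported, so the name is undefined

import mlperf_log_utils  # binds the module name itself

# The workaround from the issue: set the attribute on the imported module.
mlperf_log_utils.ROOT_DIR_TRANSFORMER = os.getcwd()
```

Note that this only shows the assignment no longer fails; whether anything later reads ROOT_DIR_TRANSFORMER from mlperf_log_utils rather than from mlperf_log is not established by the issue.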

Are there further instructions I'm missing about getting the data for this benchmark? I'm hoping to reproduce the published results on a DGX1.

Aug 20 '19 19:08 dagrayvid
