
Deeply Supervised, Layer-wise Prediction-aware (DSLP) Transformer for Non-autoregressive Neural Machine Translation

Results 9 DSLP issues

I trained the model "nat_ctc_sd_ss" with the command in the README.md on a Tesla V100 GPU, but I got an **Out of memory** error. Is there anything that should be changed? My train...

ninja: build stopped: subcommand failed.
Traceback (most recent call last):
  File "/home/nihao/anaconda3/envs/DSLP/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1673, in _run_ninja_build
    env=env)
  File "/home/nihao/anaconda3/envs/DSLP/lib/python3.7/subprocess.py", line 512, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['ninja', '-v']' returned...

Hi Chengyang, thanks for your great code! I'm trying to reproduce the GLAT+DSLP model. I checked the training scripts you provided, but I found there is no registered model for "--arch glat_sd"...

[/home/nihao/nihao-users2/yuhao/DSLP/env/ctcdecode/ctcdecode/src/ctc_beam_search_decoder.cpp:32] FATAL: "(probs_seq[i].size()) == (vocabulary.size())" check failed. The shape of probs_seq does not match with the shape of the vocabulary
[/home/nihao/nihao-users2/yuhao/DSLP/env/ctcdecode/ctcdecode/src/ctc_beam_search_decoder.cpp:32] FATAL: "(probs_seq[i].size()) == (vocabulary.size())" check failed. The shape of...

I used the following hyperparameters on IWSLT14, but the model seems to perform poorly: the result is only 26 BLEU. Does anyone know the appropriate hyperparameters for the IWSLT dataset? Thanks...

Hi, thank you for releasing the code! I have a question about the provided bash scripts for training and inference. The training script for CMLM+DSLP: `python3 train.py data-bin/wmt14.en-de_kd --source-lang...

The training command I used:
python3 train.py data-bin/wmt14.en-de_kd --source-lang en --target-lang de --save-dir checkpoints --eval-tokenized-bleu \
--keep-interval-updates 5 --save-interval-updates 500 --validate-interval-updates 500 --maximize-best-checkpoint-metric \
--eval-bleu-remove-bpe --eval-bleu-print-samples --best-checkpoint-metric bleu --log-format simple...

Hi, thank you for releasing the code! I have a question about the difference between the training and inference pipelines. The paper says that in the inference stage the predicted output of every decoder layer...