Michael Denkowski


Hi Samuel,

It looks like the error occurs shortly after Sockeye reloads the best checkpoint. There's a [reported issue](https://github.com/pytorch/pytorch/issues/80809) for PyTorch 1.12.0 where loading checkpoints causes this type of error....

> re: Naming -- Maybe let's do "store"? This establishes some connection with the original paper's notion of "create a datastore" (but IMO "datastore" is too vague of a concept...

The benchmarks in the paper run a WMT17 En-De big transformer with batch size 1 on a c5.2xlarge EC2 instance. Differences in any of these dimensions can lead to different...

Thanks Tobi and Felix! I've made some improvements based on your feedback.

Hi Vincent,

Sockeye saves the training state separately from the model parameters. The `training_state` directory contains state files for the optimizer, data iterator, etc. At each checkpoint, Sockeye saves the...

Thanks for sharing these settings! With the updated frequencies, OpenNMT's vocabulary size matches Sockeye's. We set the batch size for each toolkit to leave enough free GPU memory to avoid...

It looks like using "noam" decay with learning rate "2" gives us the right learning schedule. We'll run a benchmark with the updated settings.
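For reference, the standard "noam" schedule (inverse-square-root decay after linear warmup) can be sketched as below. The base learning rate of 2 acts as a multiplicative factor; `d_model=1024` and `warmup_steps=8000` are assumptions for illustration, not values taken from the benchmark configs.

```python
def noam_lr(step, factor=2.0, d_model=1024, warmup_steps=8000):
    """Learning rate at a given training step under Noam decay:
    factor * d_model^-0.5 * min(step^-0.5, step * warmup_steps^-1.5).
    Rises linearly during warmup, then decays as 1/sqrt(step)."""
    step = max(step, 1)
    return factor * d_model ** -0.5 * min(step ** -0.5,
                                          step * warmup_steps ** -1.5)
```

With these inputs the rate peaks at the end of warmup and decays afterward, which is why a nominal "learning rate 2" yields effective rates well below 0.001.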

Hi Vincent,

Running with your recommended settings results in faster training and higher BLEU scores:

- WMT17 En-De: 13.7 hours, 35.2 BLEU
- WMT17 Ru-En: 39.4 hours, 32.2 BLEU

We'll...

Yes, we've updated the config file to include all of your recommendations, including batch size 5000 and update interval 10. This is the log for the WMT17 En-De model: [onmt_train_wmt17_en_de.log](https://github.com/awslabs/sockeye/files/9182735/onmt_train_wmt17_en_de.log).
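For readers following along, a minimal OpenNMT-py-style training config with these settings might look like the fragment below. This is illustrative only: key names follow OpenNMT-py conventions, and the `warmup_steps` value is an assumption; the actual benchmark configuration is in the linked training log.

```yaml
# Illustrative fragment, not the full benchmark config.
batch_type: tokens
batch_size: 5000        # tokens per batch
accum_count: [10]       # update interval: accumulate 10 batches per step
optim: adam
learning_rate: 2        # factor for noam decay
decay_method: noam
warmup_steps: 8000      # assumed value for illustration
```

With token batching, batch size 5000 and update interval 10 give an effective batch of roughly 50,000 tokens per optimizer step.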

We've updated the paper with the new results: [Sockeye 3: Fast Neural Machine Translation with PyTorch](https://arxiv.org/abs/2207.05851).