
I really want to know how to increase the speed of fairseq-generate

Open kkeleve opened this issue 2 years ago • 7 comments

Hi, after training a vanilla Transformer on WMT17 zh-en, I tried to translate the source side of the training set (about 19M sentences). I tried both fairseq-generate and fairseq-interactive, but generating everything is far too slow. Is there any way to improve the speed of fairseq-generate or fairseq-interactive?

here is the script:

CUDA_VISIBLE_DEVICES=0 fairseq-generate $data_dir \
--path $model_save/average-5.pt --gen-subset train \
--beam 5 --lenpen 1.6 --batch-size 100  > $out


(I also tried --max-tokens 2048 and similar settings, but the speed is still unacceptable.) I would like to know how to speed up generation for back-translation when dealing with large datasets (e.g. 1M or 20M sentences).

kkeleve avatar Jun 30 '22 10:06 kkeleve

I would also like to know if there is a way to speed up inference.

For now, you may find more info from the author of this issue: https://github.com/facebookresearch/fairseq/issues/4478

Also, their parallel script is great if you have multiple GPUs.

The next thing you can try is adding --num-workers X. That said, you may first want to identify where your bottleneck is (you can ask in that issue how to inspect GPU/CPU utilization).
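For example, a minimal sketch of the original command with data-loading workers added (the value 8 is only an illustration; tune it to your CPU count):

# Same generate command, with extra CPU workers for data loading.
# --num-workers 8 is an assumption; adjust to your machine.
CUDA_VISIBLE_DEVICES=0 fairseq-generate $data_dir \
    --path $model_save/average-5.pt --gen-subset train \
    --beam 5 --lenpen 1.6 --batch-size 100 \
    --num-workers 8 > $out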


Another interesting post: https://github.com/facebookresearch/fairseq/issues/3100. Maybe something can be done with --no-repeat-ngram-size.
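For reference, the relevant option is --no-repeat-ngram-size (default 0, which disables n-gram blocking). If your command sets it, a quick experiment is to time a run without it; a sketch:

# n-gram blocking adds per-step cost during beam search.
# Leaving --no-repeat-ngram-size at 0 disables it entirely.
CUDA_VISIBLE_DEVICES=0 fairseq-generate $data_dir \
    --path $model_save/average-5.pt --gen-subset train \
    --beam 5 --lenpen 1.6 --batch-size 100 \
    --no-repeat-ngram-size 0 > $out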

gmryu avatar Jun 30 '22 15:06 gmryu

Thanks, this helped me a lot @gmryu

kkeleve avatar Jul 01 '22 02:07 kkeleve

I think the simplest method for this is to split your data into several shards and translate them in parallel with multiple workers.
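A rough sketch of the shard-and-merge approach (file names, shard count, and GPU indices are made up for illustration; fairseq-interactive reads raw text via --input, so the shards need no separate binarization):

# Split the raw source file into 4 shards without breaking lines
# (GNU split; produces shard.00 .. shard.03).
split -d -n l/4 train.zh shard.
# One decoding process per GPU, each on its own shard.
for i in 0 1 2 3; do
    CUDA_VISIBLE_DEVICES=$i fairseq-interactive $data_dir \
        --path $model_save/average-5.pt \
        --beam 5 --lenpen 1.6 --buffer-size 100 --batch-size 100 \
        --input shard.0$i > out.0$i &
done
wait
cat out.00 out.01 out.02 out.03 > out.merged

With a single GPU the same split still helps: run the shards sequentially and you can resume from the last unfinished shard after an interruption.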

SefaZeng avatar Jul 01 '22 08:07 SefaZeng

> I think the simplest method for this is to split your data into several shards and translate them in parallel with multiple workers.

I only have one GPU at the moment, but I will try what you said.

kkeleve avatar Jul 01 '22 08:07 kkeleve

You can consider running translations with CTranslate2, which accelerates Transformer inference. See this guide to convert Fairseq models.

Disclaimer: I'm the author of CTranslate2.
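A minimal conversion sketch, assuming the checkpoint and binarized data directory from the command at the top of the thread:

pip install ctranslate2
# Convert the fairseq checkpoint; --data_dir must point at the
# binarized directory containing the fairseq dictionaries.
ct2-fairseq-converter --model_path $model_save/average-5.pt \
    --data_dir $data_dir \
    --output_dir ct2_model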

guillaumekln avatar Jul 01 '22 14:07 guillaumekln

Hi @kkeleve, may I ask whether you did any preprocessing of the data before generating the back-translation pairs with the fairseq-generate command, and if so, could you please share the commands? I'm also trying to generate back-translation pairs from monolingual data but couldn't figure out how to do it.

tianshuailu avatar Jul 04 '22 09:07 tianshuailu

> Hi @kkeleve, may I ask whether you did any preprocessing of the data before generating the back-translation pairs with the fairseq-generate command, and if so, could you please share the commands? I'm also trying to generate back-translation pairs from monolingual data but couldn't figure out how to do it.

I think you can refer to this tutorial. Also, because fairseq-generate is too slow, I am now using CTranslate2 for translation.
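For anyone landing here later, a small usage sketch of a converted model through the CTranslate2 Python API (the model directory name and tokens are assumptions; inputs must be pre-tokenized exactly as in fairseq training, e.g. with the same BPE):

python - <<'EOF'
import ctranslate2

# "ct2_model" is the converter's --output_dir from above (an assumption).
translator = ctranslate2.Translator("ct2_model", device="cuda")

# translate_batch takes pre-tokenized sentences and returns hypotheses
# as token lists.
results = translator.translate_batch([["H@@", "ello"]], beam_size=5)
print(" ".join(results[0].hypotheses[0]))
EOF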

kkeleve avatar Jul 14 '22 03:07 kkeleve