
Beam search is very slow in Transformer

Open santhoshkolloju opened this issue 5 years ago • 2 comments

I have been using a beam size of 3 and alpha 1.0 for beam-search decoding, and it looks like it is very slow. Greedy search takes around 30-40 seconds to generate a sequence of 250 words, but beam search takes around 2 minutes.

Can you help me improve the inference speed? I tried quantizing the model to 8 bits; it reduced the model size, but the inference time remains the same.

Any help is appreciated.

Thanks

santhoshkolloju avatar Apr 17 '19 15:04 santhoshkolloju

The transformer beam search is adapted from the official implementation (tensor2tensor). I'm not sure how it can be sped up.
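For intuition on the timing gap, here is a minimal toy sketch (not texar's or tensor2tensor's implementation; the scoring function is made up, and EOS handling is omitted for brevity) showing why beam search with beam width k does roughly k decoder forward passes per step versus greedy's one:

```python
import math

VOCAB = ["a", "b", "c"]
calls = 0  # counts simulated decoder forward passes


def log_probs(prefix):
    """Toy stand-in for one decoder forward pass: a made-up
    distribution over VOCAB favoring token (len(prefix) % 3)."""
    global calls
    calls += 1
    scores = [2.0 if i == len(prefix) % 3 else 1.0 for i in range(len(VOCAB))]
    total = sum(scores)
    return [math.log(s / total) for s in scores]


def greedy(max_len=10):
    seq = []
    for _ in range(max_len):
        lp = log_probs(seq)  # one forward pass per step
        seq.append(VOCAB[max(range(len(VOCAB)), key=lp.__getitem__)])
    return seq


def beam_search(beam_width=3, max_len=10):
    beams = [([], 0.0)]  # (tokens, cumulative log-prob)
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            lp = log_probs(seq)  # one forward pass per live beam per step
            for i, tok in enumerate(VOCAB):
                candidates.append((seq + [tok], score + lp[i]))
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]
    return beams[0][0]


greedy()
greedy_calls = calls
calls = 0
beam_search()
beam_calls = calls
print(greedy_calls, beam_calls)  # beam search does ~beam_width x the work
```

With beam width 3 and length 10, the toy beam search makes 28 forward passes (1 on the first step, then 3 per step) against greedy's 10, which matches the roughly 3-4x slowdown reported above. In practice the gap can be larger still, since beam search also tends to decode longer before all beams finish.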

A possible approach would be to use a more efficient variant of the transformer decoder (e.g., Transformer-XL). We don't have the bandwidth for this at the moment, though. Any contributions are welcome.

ZhitingHu avatar Apr 21 '19 04:04 ZhitingHu

Same question.

guotong1988 avatar Nov 16 '20 03:11 guotong1988