neuralmonkey
Beam search decoder with attention model does not work
Hi,
I get an error when I train a model that uses attention and a beam search decoder. On the other hand, these two scenarios work fine:
- Attention, but no beam search decoder (greedy decoder).
- No attention, but beam search decoder is used.
The exception message mentions 'Incompatible shapes'. Please find the error log and the training config attached. Is there a problem with the config file?
Hi, thanks for letting us know. Beam search with attention currently only works with batch size 1. The workaround is setting runners_batch_size to 1 in the [main] section of the configuration file.
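As a minimal sketch of that workaround (only the section name and the runners_batch_size option come from this thread; the rest of your [main] section is assumed to stay as it is in your attached config):

```ini
[main]
; ... keep your existing [main] settings here ...
; Workaround: run the runners (including the beam search one) on one sentence at a time.
runners_batch_size=1
```

This should only affect the batch size used by the runners, so the batch size used for training can stay as configured.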
In general, I don't think there is any benefit to using beam search during training. Greedy decoding during validation gives you a good estimate of the model performance for storing the model checkpoints. Later, you can always use the models with beam search during inference.
Thanks, yes, beam search with attention works with a batch size of 1. Do you plan to support larger batch sizes soon? That would be really useful.
Hi, yes, we plan to support larger batch sizes. However, this will require a non-trivial refactoring of the current attention implementation, so I cannot say exactly when this feature will be fully supported.
Hi, please also note that increasing the batch size for inference likely won't bring any speed improvement on CPUs. Moreover, as far as I know, other toolkits don't provide this feature at all.