CTranslate2
Support for RNN based decoder units
Are there any plans to support inference for heterogeneous encoder-decoder architectures, where we use a Transformer-based encoder and an RNN/LSTM-based decoder?
I would like to submit this as a new feature request.
Currently there are no plans to support RNN-based decoders.
What is the framework you are using to train these models?
Ah, ok! We would be using Fairseq for the student model. For faster inference, we wanted to check if we can convert it to CTranslate2 with vocabulary map (vmap) support.
I would also be happy to contribute but I am not sure where to start.
Why not use a full Transformer model for the student? The model would be directly compatible with CTranslate2.
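For reference, a full Transformer student trained with Fairseq can be converted directly with the `ct2-fairseq-converter` tool that ships with CTranslate2. A minimal sketch, where the checkpoint path, data directory, and the `int8` quantization choice are illustrative assumptions rather than details from this thread:

```shell
# Hypothetical sketch: convert a Fairseq Transformer checkpoint to CTranslate2.
# Paths are placeholders; quantization is optional.
pip install ctranslate2

ct2-fairseq-converter \
    --model_path checkpoints/checkpoint_best.pt \
    --data_dir data-bin \
    --output_dir student_ct2 \
    --quantization int8
```

The resulting `student_ct2` directory can then be loaded with `ctranslate2.Translator` for inference.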
Hi @guillaumekln, currently we are using a Transformer for both the encoder and the decoder. We want to go with hybrid Transformer(enc)-RNN(dec) networks to further reduce inference latency and increase throughput.
@harishankar-gopalan Have you found any code for Transformer(enc)-RNN(dec) networks? Transformer decoding is currently a bit slow, and I also need such an encoder-decoder framework.
Hi @Andrewlesson, no. We will most probably go with a custom Fairseq model, where we define a custom architecture in Fairseq. If that doesn't work out, we would have to go with vanilla PyTorch.
Did you also consider training a Transformer model with a reduced number of decoder layers? CTranslate2 can run models with a different number of encoder and decoder layers.
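As a concrete sketch of that option, Fairseq exposes the encoder and decoder depths as separate command-line flags, so an asymmetric deep-encoder, shallow-decoder Transformer needs no custom code. All hyperparameters below are illustrative assumptions, not values from this thread:

```shell
# Hypothetical sketch: train a deep-encoder, shallow-decoder Transformer
# in Fairseq. The resulting model stays convertible to CTranslate2.
fairseq-train data-bin \
    --arch transformer \
    --encoder-layers 12 \
    --decoder-layers 1 \
    --optimizer adam --lr 5e-4 --lr-scheduler inverse_sqrt \
    --max-tokens 4096 \
    --save-dir checkpoints
```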
Yes, we are already using a deep-encoder, shallow-decoder architecture. We want to experiment with the performance of models where the encoder and decoder stacks have distinct architectures of their own.
We are training models with Marian (and using the Bergamot fork for quantization) for the Firefox Translations feature. The decoder is an RNN (see https://aclanthology.org/2020.ngt-1.26/). It would be nice to see what the performance would look like with CTranslate2 (we are already planning to use it to speed up translations with the teacher models: https://github.com/mozilla/firefox-translations-training/issues/165).