yoyodyne
GRU support
This adds GRU support; everywhere there is an LSTM model, there is now a GRU model too.
I initially tried to make the RNN type a general flag, but because LSTMs return the cell state in addition to the hidden state, and because various models need to reshape, average, or otherwise manipulate that cell state, this was really not feasible. Therefore, for each model that was previously "LSTM-backed", I created an abstract class called FooRNN{Encoder,Decoder,Model}. FooLSTM subclasses this and returns an LSTM module (it may also have special logic in its forward or decode methods), as does FooGRU.
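A minimal sketch of the pattern described above (the class and method names here are illustrative, not yoyodyne's actual ones): an abstract RNN encoder whose subclasses supply the concrete module, so LSTM-specific cell-state handling lives only in the LSTM subclass.

```python
# Sketch of the abstract-RNN pattern; names are hypothetical.
import abc

import torch
from torch import nn


class FooRNNEncoder(nn.Module, abc.ABC):
    """Abstract base: subclasses supply the concrete RNN module."""

    def __init__(self, input_size: int, hidden_size: int):
        super().__init__()
        self.rnn = self.get_module(input_size, hidden_size)

    @abc.abstractmethod
    def get_module(self, input_size: int, hidden_size: int) -> nn.RNNBase:
        ...

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # GRUs return (output, h); the default forward ignores the state.
        output, _ = self.rnn(x)
        return output


class FooGRUEncoder(FooRNNEncoder):
    def get_module(self, input_size: int, hidden_size: int) -> nn.RNNBase:
        return nn.GRU(input_size, hidden_size, batch_first=True)


class FooLSTMEncoder(FooRNNEncoder):
    def get_module(self, input_size: int, hidden_size: int) -> nn.RNNBase:
        return nn.LSTM(input_size, hidden_size, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # LSTM-specific: the module also returns a cell state c, which a
        # concrete model might reshape or average here.
        output, (h, c) = self.rnn(x)
        return output
```

The point of the override in FooLSTMEncoder is exactly the issue above: the LSTM's extra cell state can't be handled generically by a flag, so it stays contained in the LSTM subclass.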
I experimented with traditional Elman RNNs (which have the same, simpler interface as GRUs), but performance was abysmal, so I'm not going to bother.
All models have been tested on CPU and GPU.
Other changes:
- #251 is also implied here, but I separated it out for review.
- The names got confusing, so I also went ahead and replaced EncoderDecoder in our naming convention with simply Model.
Closes #180. (Note, however, that there's still plenty to do to study the effects this has.)