yoyodyne

GRU support

Open kylebgorman opened this issue 4 months ago • 0 comments

This adds GRU support; everywhere there is an LSTM model, there is now a GRU model too.

I initially tried to make the RNN type a general flag, but this was not feasible: LSTMs return the cell state in addition to the hidden state, and various models need to reshape, average, or otherwise manipulate that cell state. Therefore, for each model that was previously "LSTM-backed", I created an abstract class called FooRNN{Encoder,Decoder,Model}. FooLSTM subclasses this and returns an LSTM module (it may also have special logic in the forward method, the decode method, or elsewhere), and FooGRU does the same with a GRU module.
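As a rough illustration of this pattern (not the actual code in this PR), the sketch below shows an abstract encoder with LSTM and GRU subclasses. The class names follow the Foo placeholder above; the `get_module` hook, the constructor arguments, and the choice to return the last layer's hidden state are assumptions made for the example.

```python
import abc
from typing import Tuple

import torch
from torch import nn


class FooRNNEncoder(nn.Module, abc.ABC):
    """Abstract base; subclasses choose the recurrent module (hypothetical)."""

    def __init__(self, input_size: int, hidden_size: int):
        super().__init__()
        # get_module is an assumed hook name, not necessarily the PR's.
        self.module = self.get_module(input_size, hidden_size)

    @abc.abstractmethod
    def get_module(self, input_size: int, hidden_size: int) -> nn.RNNBase: ...

    @abc.abstractmethod
    def forward(
        self, source: torch.Tensor
    ) -> Tuple[torch.Tensor, torch.Tensor]: ...


class FooLSTMEncoder(FooRNNEncoder):
    def get_module(self, input_size: int, hidden_size: int) -> nn.LSTM:
        return nn.LSTM(input_size, hidden_size, batch_first=True)

    def forward(self, source):
        # nn.LSTM returns the output plus a (hidden, cell) pair.
        output, (hidden, _cell) = self.module(source)
        return output, hidden[-1]


class FooGRUEncoder(FooRNNEncoder):
    def get_module(self, input_size: int, hidden_size: int) -> nn.GRU:
        return nn.GRU(input_size, hidden_size, batch_first=True)

    def forward(self, source):
        # nn.GRU returns the output plus the hidden state only.
        output, hidden = self.module(source)
        return output, hidden[-1]
```

Both subclasses expose the same return signature to callers, so the difference in what the underlying module returns stays inside each `forward` method. The base class cannot be instantiated directly because its abstract methods are unimplemented.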

I experimented with traditional Elman RNNs (they have the same simpler interface as GRUs), but performance was abysmal, so I'm not going to bother.

All models have been tested on CPU and GPU.

Other changes:

  • #251 is also implied here, but I separated it out for review.
  • The names got confusing, so I also went ahead and replaced EncoderDecoder in our naming convention with simply Model.

Closes #180. (Note, however, that there is still plenty to do to study the effects of this change.)

kylebgorman, Oct 08 '24 16:10