S.Wang
I'm not familiar with Chainer either, but I found a reference on using GPUs with it: http://docs.chainer.org/en/stable/tutorial/gpu.html Hope that helps!
@0bserver07 Looks great! Thanks!
I don't think this LSTM variant differs from the original version, except that the input X_t is designed to include the memory readings :)
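To make that concrete, here is a rough pure-Python sketch of what I mean. It is only my reading, not the paper's or any repo's actual code: the names `lstm_step` and `controller_step`, the gate ordering (i, f, o, g), and the single stacked weight matrix `W` are all assumptions for illustration. The cell itself is a plain LSTM; the only difference is that the input is x_t concatenated with the memory read vectors.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def matvec(W, v):
    # Plain matrix-vector product on nested lists.
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def lstm_step(x, h_prev, c_prev, W, b):
    """One step of a standard LSTM cell.

    Assumed layout (hypothetical): W has 4*H rows and len(x)+H columns,
    with the gate blocks stacked in the order i, f, o, g.
    """
    H = len(h_prev)
    z = [a + bias for a, bias in zip(matvec(W, x + h_prev), b)]
    i = [sigmoid(v) for v in z[0:H]]            # input gate
    f = [sigmoid(v) for v in z[H:2 * H]]        # forget gate
    o = [sigmoid(v) for v in z[2 * H:3 * H]]    # output gate
    g = [math.tanh(v) for v in z[3 * H:4 * H]]  # candidate cell state
    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c


def controller_step(x_t, reads, h_prev, c_prev, W, b):
    # The only change from a plain LSTM: the cell input is x_t
    # concatenated with the read vectors from the previous time step.
    chi = list(x_t)
    for r in reads:
        chi += list(r)
    return lstm_step(chi, h_prev, c_prev, W, b)
```

So if x_t has 3 entries and there are 2 read vectors of width 4, the cell just sees an 11-dimensional input and everything else is the usual LSTM update.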
@Seraphli I think it is just the right expression for an LSTM in a network with several hidden layers. The l-th layer's cell gets input from its own past (h_{t-1}^l), its lower layer...