
Training with several hidden layers

Open VeliBaba opened this issue 9 years ago • 8 comments

Hi! I have some questions about faster-rnnlm. It is possible to use several hidden layers during training. My questions are:

  1. Which of them is used for recurrent part?
  2. Does it use those hidden layers during decoding or computing entropy? Thanks!

VeliBaba avatar Aug 05 '15 05:08 VeliBaba

Hi!

  1. All of them. The output of one layer is the input of the next one. For instance, if you have two tanh layers, then the network looks like this (see the sketch after this list):
     h1_t = tanh(x_t + W1 * h1_{t-1})
     h2_t = tanh(U * h1_t + W2 * h2_{t-1})
  2. Yes, it does.
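
In pseudo-code, the two stacked layers behave roughly like this (a minimal NumPy sketch for illustration, not the actual faster-rnnlm C++ implementation; the matrix names follow the formulas above):

```python
import numpy as np

size = 100                               # hidden layer size
W1 = np.random.randn(size, size) * 0.1   # recurrent weights, layer 1
W2 = np.random.randn(size, size) * 0.1   # recurrent weights, layer 2
U  = np.random.randn(size, size) * 0.1   # feed-forward weights between layers

h1 = np.zeros(size)                      # hidden states carried across time steps
h2 = np.zeros(size)

def step(x_t):
    """Process one word embedding x_t and return the top hidden state."""
    global h1, h2
    h1 = np.tanh(x_t + W1 @ h1)          # h1_t = tanh(x_t + W1 * h1_{t-1})
    h2 = np.tanh(U @ h1 + W2 @ h2)       # h2_t = tanh(U * h1_t + W2 * h2_{t-1})
    return h2                            # fed to the output/softmax layer
```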

akhti avatar Aug 05 '15 09:08 akhti

Ok. Is the output of the last hidden layer used as the input of the next neural network?

VeliBaba avatar Aug 05 '15 09:08 VeliBaba

What is 'next neural network'? If you mean the next time step (the next word), then the answer is yes.

akhti avatar Aug 05 '15 09:08 akhti

Yes, I mean this. Ok, thanks

VeliBaba avatar Aug 05 '15 09:08 VeliBaba

Does using several hidden layers instead of a single hidden layer improve performance? Which is better: a single hidden layer of size 400, or 4 hidden layers of size 100?

VeliBaba avatar Aug 05 '15 10:08 VeliBaba

First, when you make a layer 4 times larger, training/evaluation time (in theory) increases 16 times (4 squared), whereas stacking 4 layers of the same size only increases it 4 times. So it's more reasonable to compare 1 layer of size 400 with 4 layers of size 200. However, I would recommend training a shallow network with a single layer first.
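
For reference, a back-of-the-envelope estimate of the per-step recurrent cost (a rough illustration that counts only the recurrent matrix multiplications and ignores the input and output layers):

```python
# Each layer's recurrent update costs roughly size^2 operations per time step,
# so the total recurrent cost scales as num_layers * size^2.
def recurrent_cost(num_layers, size):
    return num_layers * size ** 2

print(recurrent_cost(1, 100))   # 10000   (baseline: 1 layer of size 100)
print(recurrent_cost(1, 400))   # 160000  (16x the baseline)
print(recurrent_cost(4, 200))   # 160000  (comparable to 1 layer of size 400)
print(recurrent_cost(4, 100))   # 40000   (much cheaper than either)
```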

akhti avatar Aug 05 '15 10:08 akhti

Hi! I have two different toolkits for training an rnnlm: the first one is rnnlm-hs-0.1b (Ilya-multithreading), and the second one is faster-rnnlm. With the same options, faster-rnnlm is about 3 times faster than rnnlm-hs-0.1b. Is it expected that the valid entropy at the end of training may be worse with faster-rnnlm than with rnnlm-hs-0.1b?

VeliBaba avatar Aug 06 '15 05:08 VeliBaba

It's expected that the entropy will be more or less the same.

akhti avatar Aug 06 '15 11:08 akhti