recurrentjs

Training LSTM for classification

Open denull opened this issue 8 years ago • 0 comments

I'm having trouble understanding this part of the example code:

// for example lets assume we have binary classification problem
// so the output of the LSTM are the log probabilities of the
// two classes. Lets first get the probabilities:
var prob1 = R.softmax(out1.o);
var target1 = 0; // suppose first input has class 0
cost += -Math.log(probs.w[ix_target]); // softmax cost function

// cross-entropy loss for softmax is simply the probabilities:
out1.dw = prob1.w;
// but the correct class gets an extra -1:
out1.dw[ix_target] -= 1;

In particular, what is going on with the cost variable? It is neither declared nor used anywhere. I also don't understand why out1 is used for training. Shouldn't it be the last output, out3?

I'm trying to solve a similar problem: feed a sequence of input vectors to the model, then retrieve a single output vector. But I am unsure how to correctly train the LSTM in this case.
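To make the question concrete, here is a minimal sketch of the sequence-to-single-output case in plain JavaScript. It does not call recurrent.js: `softmax`, `softmaxCrossEntropy`, and the `stepLogits` array are hypothetical stand-ins for the per-step outputs (out1.o, out2.o, out3.o), and the numbers are made up. It assumes the loss is applied only to the last step. The cost value here is only a scalar for monitoring; the part that would feed backpropagation is the gradient, which is the probabilities with 1 subtracted at the target class, as in the quoted comments.

```javascript
// Plain-JavaScript sketch; no recurrent.js calls, all names are hypothetical.

// Numerically stable softmax over an array of logits.
function softmax(logits) {
  const max = Math.max(...logits);
  const exps = logits.map((v) => Math.exp(v - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

// Cross-entropy cost for one target class, plus its gradient
// with respect to the logits (probabilities minus one-hot target).
function softmaxCrossEntropy(logits, targetIndex) {
  const probs = softmax(logits);
  const cost = -Math.log(probs[targetIndex]);
  const dLogits = probs.slice();
  dLogits[targetIndex] -= 1;
  return { cost, dLogits };
}

// Made-up logits standing in for the outputs at steps 1, 2 and 3.
const stepLogits = [
  [0.2, -0.1],
  [0.5, 0.3],
  [2.0, -1.0],
];
const target = 0; // assumed class label for the whole sequence

// Assumption: only the final step's output is scored.
const last = stepLogits[stepLogits.length - 1];
const result = softmaxCrossEntropy(last, target);

console.log("cost:", result.cost);
console.log("gradient on last-step logits:", result.dLogits);
```

In a real recurrent.js setup the gradient array would presumably be written into the dw field of the last step's output before running the backward pass, but that mapping is an assumption rather than something confirmed by the README.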

denull • Apr 08 '16 14:04