recurrentjs
Training LSTM for classification
I'm having trouble understanding this part of the example code:
// for example lets assume we have binary classification problem
// so the output of the LSTM are the log probabilities of the
// two classes. Lets first get the probabilities:
var prob1 = R.softmax(out1.o);
var target1 = 0; // suppose first input has class 0
cost += -Math.log(probs.w[ix_target]); // softmax cost function
// cross-entropy loss for softmax is simply the probabilities:
out1.dw = prob1.w;
// but the correct class gets an extra -1:
out1.dw[ix_target] -= 1;
I'm especially confused by the cost variable: it is neither declared nor used anywhere else. I also don't understand why out1 is used for training. Shouldn't it be the last output, out3?
I'm trying to solve a similar problem: feed a sequence of input vectors to the model, then retrieve a single output vector. I'm unsure how to train the LSTM correctly in this case.
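For reference, here is a minimal sketch of what the quoted loss code appears to be computing, written in plain JavaScript rather than against the recurrentjs API. Plain arrays stand in for the .w and .dw fields, and the names softmaxCrossEntropy, lastOutput and targetIndex are made up for illustration. It shows that the cost value is only a number to monitor, while the gradient written into dw is the softmax probabilities with 1 subtracted at the correct class. For a sequence-to-one setup, one common choice is to apply this to the output of the final step only (out3 in the example), assuming the library backpropagates whatever gradient has been written into that output.

```javascript
// Numerically stable softmax over a plain array of unnormalized scores.
function softmax(logits) {
  const max = Math.max(...logits);
  const exps = logits.map(v => Math.exp(v - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map(e => e / sum);
}

// Cross-entropy cost for one example, plus its gradient with respect
// to the scores. The gradient does not depend on the cost variable:
// it is simply probs - onehot(targetIndex).
function softmaxCrossEntropy(logits, targetIndex) {
  const probs = softmax(logits);
  const cost = -Math.log(probs[targetIndex]); // value to monitor only
  const dLogits = probs.slice();              // start from the probabilities
  dLogits[targetIndex] -= 1;                  // correct class gets an extra -1
  return { cost, dLogits };
}

// Hypothetical scores from the last step of a 3-step sequence (assumption:
// two classes, and the sequence's true class is 0).
const lastOutput = [2.0, 0.5];
const result = softmaxCrossEntropy(lastOutput, 0);
console.log(result.cost, result.dLogits);
```

In the quoted snippet, prob1 and probs would play the role of probs here, and target1 and ix_target the role of targetIndex, so the mismatched names look like leftovers rather than separate variables.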