
Adjust at the end

FauconFan opened this issue 7 years ago · 6 comments

Hi,

I don't know if this has already been reported, but it is still not fixed in the code. When you train your model, you calculate the weight deltas between the hidden layer and the output layer, but you update those weights too early: you then use the new weights to calculate the deltas for the previous weights during back-propagation. You have to keep the deltas in memory and apply them at the end, after the rest of the backward pass.

That's why your model takes so long to train on the XOR problem; it shouldn't take that long. From memory, you used 50,000 iterations, which is far too many for a problem like this.
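For example, here is a minimal sketch of the difference in JavaScript (a toy scalar network with linear activations and a made-up loss, not your actual code):

// Toy scalar "network": x -> (w1) -> h -> (w2) -> y, linear activations,
// loss = 0.5 * (y - target)^2.
const x = 1.0, target = 0.0, lr = 0.1;
let w1 = 0.5, w2 = 0.5;

// forward pass
const h = w1 * x;
const y = w2 * h;
const dy = y - target; // dLoss/dy

// BUGGY order: update w2 first, then use the NEW w2 for w1's gradient
// w2 -= lr * dy * h;      // too early!
// w1 -= lr * dy * w2 * x; // wrong: this uses the already-updated w2

// CORRECT order: compute both gradients with the old weights, update after
const gradW2 = dy * h;
const gradW1 = dy * w2 * x; // uses the old w2, as back-propagation requires
w2 -= lr * gradW2;
w1 -= lr * gradW1;

With only two weights the numeric difference is small, but it compounds with depth and with every iteration.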

Keep going ^^.

P.S.: Sorry if my English is not perfect; it is not my native language.

FauconFan · Feb 14 '18 10:02

I wonder if the gradients should be updated in the same way?

Sorry, I don't know the algorithm well.

xxMrPHDxx · Feb 14 '18 20:02

I don't think it matters, or maybe I didn't understand your comment... sorry ^^

As for the algorithm, the order is important, because if the weights are updated too early, the following back-propagation steps are disturbed. For a 3-layer neural network it doesn't really matter, but for a deeper network this is very important...
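Here is a sketch of what I mean, with one scalar weight per layer for brevity (the function and its arguments are made up, not the repo's code):

// Each layer's error is derived from the weights of the layer above it,
// so updating weights[i] before computing the error below it corrupts
// every remaining step of the backward pass.
function backprop(weights, activations, outputError, lr) {
  const grads = new Array(weights.length);
  let error = outputError;
  for (let i = weights.length - 1; i >= 0; i--) {
    grads[i] = error * activations[i]; // gradient w.r.t. weights[i]
    error = error * weights[i];        // must use the OLD weights[i]
  }
  // apply all the updates only after the full backward pass
  for (let i = 0; i < weights.length; i++) {
    weights[i] -= lr * grads[i];
  }
  return weights;
}

The deeper the network, the more of these propagation steps an early update can corrupt.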

:)

FauconFan · Feb 14 '18 20:02

Based on what you mentioned earlier, I assume the calculation is done as follows:

deltas = []
for layer in layers:
    delta = calculateDelta()
    deltas.append(delta)

    calculateGradient()
    bias.addGradient()

for weight, delta in zip(weights, deltas):
    weight.add(delta)

Am I right?

xxMrPHDxx · Feb 14 '18 21:02

Yes, but the biases must be treated in the same way as the weights, for the same reason.

Also, you assume that gradients and deltas are different things, but the deltas are calculated using the gradients (just in case that was not clear).

So now:

deltas_w = []
deltas_bias = []
for layer in layers:
    gradient = calculateGradient()
    delta = calculateDelta(gradient)
    deltas_w.append(delta)
    deltas_bias.append(gradient)  # according to the model

# deltas are computed with the learning rate and multiplied by -1 for the descent part
for weight, delta in zip(weights, deltas_w):
    weight.add(delta)
for bias, b in zip(biases, deltas_bias):
    bias.add(b)

This algorithm may be wrong, and may only work for a convolutional neural network.
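For reference, the same idea as runnable JavaScript, with the biases deferred exactly like the weights (scalar layers again for brevity; the names are made up):

function backpropWithBias(weights, biases, activations, outputError, lr) {
  const deltasW = new Array(weights.length);
  const deltasB = new Array(weights.length);

  let error = outputError;
  for (let i = weights.length - 1; i >= 0; i--) {
    const gradient = -lr * error;           // learning rate, and -1 for the descent part
    deltasW[i] = gradient * activations[i];
    deltasB[i] = gradient;                  // the bias delta is the gradient itself
    error = error * weights[i];             // still the OLD weights
  }

  // apply everything only after the full backward pass
  for (let i = 0; i < weights.length; i++) {
    weights[i] += deltasW[i];
    biases[i] += deltasB[i];
  }
}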

Hope it helps

FauconFan · Feb 14 '18 21:02

"but bias must be treated in the same way"

That's what I meant earlier. Thanks for the info.

xxMrPHDxx · Feb 14 '18 21:02

You're welcome.

FauconFan · Feb 14 '18 21:02