Chapter 5: weight_deltas calculation in case of multiple inputs and multiple outputs

Open dimchansky opened this issue 5 years ago • 3 comments

weight_deltas is currently calculated as the outer product of input and delta:

[ [input[0] * delta[0], input[0] * delta[1], input[0] * delta[2]],
  [input[1] * delta[0], input[1] * delta[1], input[1] * delta[2]],
  [input[2] * delta[0], input[2] * delta[1], input[2] * delta[2]] ]

but should be transposed:

[ [input[0] * delta[0], input[1] * delta[0], input[2] * delta[0]],
  [input[0] * delta[1], input[1] * delta[1], input[2] * delta[1]],
  [input[0] * delta[2], input[1] * delta[2], input[2] * delta[2]] ]

otherwise the weights are updated incorrectly: row i of weight_deltas has to hold the updates for the weights feeding output i.
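One way to check which orientation is right is to compare both candidates against a finite-difference gradient. The sketch below is only an illustration: the numbers are made up, the layout weights[output][input] is an assumption about how the chapter stores its weights, and the variable is named inputs rather than input to avoid shadowing the Python built-in.

```python
import numpy as np

# Hypothetical values, chosen only for illustration.
inputs = np.array([8.5, 0.65, 1.2])
true = np.array([0.1, 1.0, 0.1])
# Assumed layout: weights[i][j] connects input j to output i.
weights = np.array([[0.1, 0.1, -0.3],
                    [0.1, 0.2, 0.0],
                    [0.0, 1.3, 0.1]])

def loss(w):
    # Half the sum of squared errors, so d(loss)/d(pred) equals delta.
    delta = w.dot(inputs) - true
    return 0.5 * np.sum(delta ** 2)

delta = weights.dot(inputs) - true

# Finite-difference gradient of the loss with respect to each weight.
eps = 1e-6
numeric = np.zeros_like(weights)
for i in range(weights.shape[0]):
    for j in range(weights.shape[1]):
        bumped = weights.copy()
        bumped[i, j] += eps
        numeric[i, j] = (loss(bumped) - loss(weights)) / eps

# Compare the numeric gradient with both orientations of the outer product.
matches_delta_input = np.allclose(numeric, np.outer(delta, inputs), atol=1e-3)
matches_input_delta = np.allclose(numeric, np.outer(inputs, delta), atol=1e-3)
print(matches_delta_input, matches_input_delta)
```

Under these assumptions only the delta-by-input orientation agrees with the numeric gradient, which is the transposed form proposed above.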

Current code:

import numpy as np

def outer_prod(a, b):
    # start from a len(a) x len(b) matrix of zeros
    out = np.zeros((len(a), len(b)))

    for i in range(len(a)):
        for j in range(len(b)):
            out[i][j] = a[i] * b[j]
    return out

weight_deltas = outer_prod(input,delta)

PR #22 should fix the issue.

dimchansky, Aug 31 '19 22:08

@iamtrask In fact, the code proposed in the book seems difficult to understand. I think it would be much clearer to write it like this:

weight_deltas = [ele_mul(d,input) for d in delta]
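To see that this one-liner produces the transposed matrix, here is a minimal sketch. It assumes ele_mul multiplies every element of a vector by a scalar (my reading of the helper, re-implemented here rather than copied from the book), and the input and delta values are made up. The lengths differ on purpose so the row and column counts are easy to tell apart.

```python
import numpy as np

def ele_mul(number, vector):
    # Assumed helper: scale every element of `vector` by the scalar `number`.
    return [number * v for v in vector]

inputs = [8.5, 0.65, 1.2]   # hypothetical input values (3 inputs)
delta = [0.455, -0.02]      # hypothetical deltas (2 outputs)

# One row per delta, one column per input.
weight_deltas = [ele_mul(d, inputs) for d in delta]

# The result matches NumPy's outer product of delta and inputs.
same_as_outer = np.allclose(weight_deltas, np.outer(delta, inputs))
print(len(weight_deltas), len(weight_deltas[0]), same_as_outer)
```

Each row is one delta scaled across all inputs, so the row count follows the outputs and the column count follows the inputs.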

dimchansky, Aug 31 '19 22:08

I second this. The weight_deltas matrix should have as many columns as the "input" vector has elements. It should be

weight_deltas = outer_prod(delta, input)
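The column-count argument is easiest to see when the number of inputs differs from the number of outputs. The following sketch reuses outer_prod from the issue with made-up values and assumes the weights matrix has one row per output.

```python
import numpy as np

def outer_prod(a, b):
    # Same loop as in the issue: out[i][j] = a[i] * b[j].
    out = np.zeros((len(a), len(b)))
    for i in range(len(a)):
        for j in range(len(b)):
            out[i][j] = a[i] * b[j]
    return out

inputs = [1.0, 2.0, 3.0, 4.0]   # 4 hypothetical inputs
delta = [0.1, -0.2, 0.3]        # 3 hypothetical output deltas
# Assumed layout: one row per output, one column per input.
weights = np.zeros((len(delta), len(inputs)))

good = outer_prod(delta, inputs)   # shape (3, 4), same as weights
bad = outer_prod(inputs, delta)    # shape (4, 3), transposed
print(good.shape, bad.shape, weights.shape)
```

With three inputs and three outputs both orientations have the same shape, so the mistake produces wrong updates silently; with unequal sizes the shapes no longer match the weights matrix at all.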

PeterGanZW, Feb 08 '20 08:02

@PeterGanZW Yep, I proposed the same change in PR #22.

dimchansky, Feb 11 '20 17:02