python-machine-learning-book-2nd-edition

Chapter 15 p498 typo?

Open tlin40 opened this issue 4 years ago • 3 comments

I'm confused by the practical formula for computing a discrete convolution after padding on p498. Shouldn't the index of x^{p} be [i+p-k] instead of [i+m-k], so that x^{p}[i+p] and the original x[i] have the same value?

tlin40, Mar 01 '20 12:03

So I think you are talking about this equation: $y[i] = \sum_{k=0}^{m-1} x^p[i+m-k] \, w[k]$

The equation is correct. $x^p$ is the padded input vector, with $p$ zeros on each side of the vector. $k$ is the index of the summation. Let's assume n=8, m=3, and p=1; then x_p has size n+2*p=10, and the output elements are as follows:

  • y[0] => np.sum(x_p[0:3][::-1] * w[:])
  • y[1] => np.sum(x_p[1:4][::-1] * w[:])
  • y[2] => np.sum(x_p[2:5][::-1] * w[:])
  • ...

So even if you change the padding, for example to p=5, x_p will be padded with 5 zeros on the left and right, but the indices in the above examples stay the same:

  • y[0] => np.sum(x_p[0:3][::-1] * w[:])
  • y[1] => np.sum(x_p[1:4][::-1] * w[:])
  • y[2] => np.sum(x_p[2:5][::-1] * w[:])
  • ...

The only difference is that x_p here has size n+2*p=18 (instead of 10).
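
To make the point above concrete, here is a minimal NumPy sketch. The helper name `conv1d_slices` and the values of `x` and `w` are made up for illustration and do not come from the book. It reverses each window of length m before multiplying by w, and it shows that the window used for y[i] is addressed the same way for p=1 and p=5; only the size of x_p (and therefore the number of outputs) changes.

```python
import numpy as np

def conv1d_slices(x, w, p):
    """Slice-based 1D convolution of x with w after zero-padding by p on each side."""
    x_p = np.pad(x, (p, p), mode="constant")   # p zeros on the left and on the right
    m = len(w)
    n_out = len(x_p) - m + 1                   # positions where w fits entirely inside x_p
    # y[i] uses the window x_p[i:i+m], reversed so that w[0] meets x_p[i+m-1]
    return np.array([np.sum(x_p[i:i + m][::-1] * w) for i in range(n_out)])

x = np.arange(1.0, 9.0)            # n = 8, illustrative values 1..8
w = np.array([1.0, 0.0, -1.0])     # m = 3, illustrative filter

y1 = conv1d_slices(x, w, p=1)      # x_p has size 8 + 2*1 = 10
y5 = conv1d_slices(x, w, p=5)      # x_p has size 8 + 2*5 = 18
print(len(y1), len(y5))            # 8 16
```

With a stride of 1, as assumed here, the output has n+2*p-m+1 elements, so the larger padding yields more outputs even though each y[i] is computed from the same slice x_p[i:i+m].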

vmirly, Mar 03 '20 05:03

Thank you for the quick response! Somehow in my mind I just ignored the zeros padded on the left of x_p and started from x_p[0+p] (the first non-padding element of x_p).

There's another thing I've noticed: if y[i], x_p[i], w[i] are all meant to be zero-indexed vectors, x_p[i+m-k] probably should be x_p[i+m-1-k], so that when y[i=0] and k=m-1, x_p[0] is the leftmost element to be used instead of x_p[1].

tlin40, Mar 03 '20 15:03

Yes you are right, the current formula gives the following for i=0:

i=0 :: y[0] = x_p[0+3-0] . w[0]  +  x_p[0+3-1] . w[1]  +  x_p[0+3-2] . w[2]
            = x_p[3] . w[0]      +  x_p[2] . w[1]      +  x_p[1] . w[2]

which is not correct, and instead should have been:

i=0 :: y[0] = x_p[0+3-1-0] . w[0]  +  x_p[0+3-1-1] . w[1]  +  x_p[0+3-1-2] . w[2]
            = x_p[2] . w[0]        +  x_p[1] . w[1]        +  x_p[0] . w[2]
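
As a quick sanity check of the two index expressions, here is a small sketch. The function name `conv1d_fixed` and the sample values are hypothetical, chosen only for this example. It lists the x_p indices touched by y[0] under each formula, then evaluates the zero-indexed form x_p[i+m-1-k] with explicit loops and compares the result with `np.convolve(..., mode="valid")` applied to the padded vector.

```python
import numpy as np

m = 3
# Indices of x_p touched by y[0] for k = 0, 1, ..., m-1
book_idx = [0 + m - k for k in range(m)]        # [3, 2, 1]: x_p[0] is never used
fixed_idx = [0 + m - 1 - k for k in range(m)]   # [2, 1, 0]: ends at x_p[0]

def conv1d_fixed(x, w, p):
    """y[i] = sum_k x_p[i + m - 1 - k] * w[k], with all arrays zero-indexed."""
    x_p = np.pad(x, (p, p), mode="constant")    # p zeros on each side
    m = len(w)
    y = np.zeros(len(x_p) - m + 1)
    for i in range(len(y)):
        for k in range(m):
            y[i] += x_p[i + m - 1 - k] * w[k]
    return y

x = np.array([3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0])   # n = 8, illustrative
w = np.array([0.5, 1.0, -1.0])                            # m = 3, illustrative
p = 1

y = conv1d_fixed(x, w, p)
reference = np.convolve(np.pad(x, (p, p), mode="constant"), w, mode="valid")
print(np.allclose(y, reference))                # True
```

Note that with the index i+m-k, the last output position would also read x_p at index n+2*p, one past the end of the padded vector, which is another sign that the offset should be m-1 for zero-indexed arrays.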

Thanks for bringing this to our attention. I had not noticed this issue.

vmirly, Mar 04 '20 04:03