
[QUESTION] What is the correct form of Equation 4-13 in CH. 4 Logistic Regression?


It's not clear why the theta in this formula is transposed:

$h_{\boldsymbol{\theta}}(\mathbf{x}) = \sigma(\boldsymbol{\theta}^T \mathbf{x})$

I believe it should've been the x that gets transposed, so the correct equation would be:

$h_{\boldsymbol{\theta}}(\mathbf{x}) = \sigma(\mathbf{x}^T \boldsymbol{\theta})$

siavashr99 avatar Jul 21 '24 06:07 siavashr99

For two vectors $a$ and $b$, we have $a^Tb = b^T a$, so the two expressions in your post are identical.
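A quick NumPy check of this identity (just a sketch with arbitrary values):

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0, 6.0])

# For 1-D arrays, np.dot computes the dot product, which is commutative
print(np.dot(a, b))  # 32.0
print(np.dot(b, a))  # 32.0
```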

LittleCottage avatar Nov 23 '24 17:11 LittleCottage

Thanks for your question @siavashr99. As @LittleCottage said, the dot product of two vectors is commutative: a · b = b · a. By the way, in Machine Learning, the dot product of two vectors a and b is often denoted $a^T b$. Mathematically speaking, that's not quite right: a and b would have to be column vectors (i.e., matrices with a single column), and the result would be a 1 ⨉ 1 matrix with a single item equal to a · b. But using this notation has the advantage that we can use the same notation whether we're talking about vectors or matrices (there's a note near the start of Chapter 4 about this).
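To see the notation point concretely, here's a minimal NumPy sketch (arbitrary values): with column vectors, $a^T b$ is a 1 ⨉ 1 matrix whose single entry equals a · b.

```python
import numpy as np

# Column vectors: shape (3, 1)
a = np.array([[1.0], [2.0], [3.0]])
b = np.array([[4.0], [5.0], [6.0]])

result = a.T @ b     # (1, 3) @ (3, 1) -> (1, 1) matrix
print(result.shape)  # (1, 1)
print(result[0, 0])  # 32.0, equal to the dot product a · b
```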

Now if you're talking about a matrix X rather than a vector x, then things are different: the order matters. Suppose X is an m ⨉ d matrix (there are m samples, and each of them is d-dimensional), and the weight matrix Θ is n ⨉ d (where n is the output dimensionality). Then we usually compute $X\Theta^T$, since this will give us an output matrix of shape m ⨉ n: that's usually the shape we want, with one row per sample (just like X has one row per sample).
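Here's a quick shape check in NumPy (random data, purely illustrative):

```python
import numpy as np

m, d, n = 5, 3, 2             # 5 samples, 3 features, 2 outputs
X = np.random.rand(m, d)      # one row per sample
Theta = np.random.rand(n, d)  # one row per output unit

outputs = X @ Theta.T         # (m, d) @ (d, n) -> (m, n)
print(outputs.shape)          # (5, 2): one row per sample
```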

In many cases, to avoid the transpose step, the weight matrix is directly represented as a d ⨉ n matrix (instead of n ⨉ d), so there's no need to transpose it: we just compute $X\Theta$, no transpose.
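And with that convention, the same sketch becomes:

```python
import numpy as np

m, d, n = 5, 3, 2
X = np.random.rand(m, d)
Theta = np.random.rand(d, n)  # weights stored directly as (d, n)

outputs = X @ Theta           # (m, d) @ (d, n) -> (m, n), no transpose needed
print(outputs.shape)          # (5, 2)
```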

Hope this helps.

ageron avatar Oct 14 '25 21:10 ageron