[QUESTION] What is the correct form of Equation 4-13 in CH. 4 Logistic Regression?
It's not clear why the theta in this formula is transposed.
I believe it should have been the x that gets transposed. So this is the correct equation.
For two vectors $a$ and $b$, we have $a^Tb = b^T a$, so the two expressions in your post are identical.
Thanks for your question @siavashr99. As @LittleCottage said, the dot product of two vectors is commutative: a · b = b · a. By the way, in Machine Learning, the dot product of two vectors a and b is often denoted a⊺b. Mathematically speaking, that's not quite right: a and b would have to be column vectors (i.e., matrices with a single column), and the result would be a 1 ⨉ 1 matrix with a single item equal to a · b. But this notation has the advantage that we can write things the same way whether we're talking about vectors or matrices (there's a note near the start of Chapter 4 about this).
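Here's a minimal NumPy sketch (with made-up numbers, just for illustration) showing both points: the dot product of 1-D vectors is commutative, and treating them as column vectors turns θ⊺x into a 1 ⨉ 1 matrix holding that same value:

```python
import numpy as np

# Made-up 3-dimensional vectors, just to illustrate the point
theta = np.array([0.5, -1.0, 2.0])
x = np.array([1.0, 3.0, -2.0])

# For 1-D arrays, the dot product is commutative
print(theta @ x)  # -6.5
print(x @ theta)  # -6.5, same value

# As column vectors (d × 1 matrices), θ⊺x is a 1 × 1 matrix with that same value
theta_col = theta.reshape(-1, 1)
x_col = x.reshape(-1, 1)
print(theta_col.T @ x_col)  # [[-6.5]]
```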
Now if you're talking about a matrix X rather than a vector x, then things are different: the order matters. Suppose X is an m ⨉ d matrix (there are m samples, and each of them is d-dimensional), and the weight matrix Θ is n ⨉ d (where n is the output dimensionality). Then we usually compute XΘ⊺, since this gives us an output matrix of shape m ⨉ n: that's usually the shape we want, with one row per sample (just like X has one row per sample).
In many cases, the weight matrix is directly represented as a d ⨉ n matrix (instead of n ⨉ d), so there's no need to transpose it: we just compute XΘ.
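To make the shapes concrete, here's a small NumPy sketch (the sizes are arbitrary, just for illustration) showing that the two conventions give the same m ⨉ n output:

```python
import numpy as np

m, d, n = 5, 3, 2  # arbitrary sizes: 5 samples, 3 features, 2 outputs
rng = np.random.default_rng(42)

X = rng.normal(size=(m, d))        # one row per sample
Theta = rng.normal(size=(n, d))    # n × d convention: needs a transpose

Y1 = X @ Theta.T                   # shape (m, n), one row per sample
print(Y1.shape)                    # (5, 2)

# Same computation with the d × n convention: no transpose needed
Theta2 = Theta.T                   # shape (d, n)
Y2 = X @ Theta2
print(np.allclose(Y1, Y2))         # True
```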
Hope this helps.