lecture-python.myst
PCA description in SVD lecture
Hey QuantEcon folks!
I was going through the SVD lecture (lecture number 5) and came across the section on PCA (5.8). Having re-read the section a couple of times, I think the description of the whole process is a bit messy. The data is presented as $X$, an $m \times n$ matrix with $m$ variables and $n$ individuals.
First, I believe the text means to compute averages by variable, not by individual, but the notation describes the latter. The averages are computed and the average matrix $\bar{X}$ is written as a column vector of ones multiplied by $[\bar{X}_1 \cdots \bar{X}_n]$, which has $n$ entries (one per individual) rather than $m$ (one per variable).
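To make the by-variable reading concrete, here is a minimal NumPy sketch. It assumes the layout described above (rows are variables, columns are individuals); the names `var_means`, `X_bar` and `B` and the random data are my own illustration, not taken from the lecture.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 3, 5                      # m variables (rows), n individuals (columns)
X = rng.normal(size=(m, n))      # illustrative data only

# Average each variable across individuals: one mean per row.
var_means = X.mean(axis=1, keepdims=True)    # shape (m, 1)

# Mean matrix: a column of m variable means times a row of n ones.
X_bar = var_means @ np.ones((1, n))          # shape (m, n)

# Centered data: every row (variable) now has mean zero.
B = X - X_bar
```

With this orientation the vector of means has $m$ entries and multiplies a row of ones from the left, which is the transpose of what the notation in the section suggests.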
Thereafter, the section on decomposing the covariance matrix uses $B^TB$ (instead of $BB^T$), which I believe would result in an $n \times n$ matrix, as opposed to the desired $m \times m$. Furthermore, the description of the decomposition includes a passage on the covariance matrix $C$ potentially not being diagonalizable, even though a covariance matrix is symmetric and positive semidefinite, and therefore always diagonalizable.
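A quick shape check, as a sketch under the assumption that $B$ is the centered $m \times n$ data matrix (the variable names and random data here are mine, not the lecture's):

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 3, 5                                  # m variables, n individuals
X = rng.normal(size=(m, n))                  # illustrative data only
B = X - X.mean(axis=1, keepdims=True)        # center each variable (row)

C_rows = B @ B.T / n     # (m, m): covariance between variables
C_cols = B.T @ B / n     # (n, n): not the variable covariance

# C_rows is symmetric, so eigh applies and returns real eigenvalues
# together with an orthonormal set of eigenvectors.
eigvals, eigvecs = np.linalg.eigh(C_rows)
reconstructed = eigvecs @ np.diag(eigvals) @ eigvecs.T
```

Since `C_rows` is symmetric, the spectral theorem guarantees the decomposition `eigvecs @ diag(eigvals) @ eigvecs.T`, and the eigenvalues are non-negative up to floating-point error.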
There might also be a typo in the last score matrix $T$.
(The covariance operation also uses $\frac{1}{n}$ instead of the sample version, $\frac{1}{n-1}$, but I was not sure of the intent there, so I didn't know which it was meant to be.)
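For reference, the two normalizations can be compared against `np.cov`, which treats rows as variables by default. This is only a sketch with made-up data, again assuming the $m \times n$ layout above; which convention the lecture intends is still an open question.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 3, 5                                  # m variables, n individuals
X = rng.normal(size=(m, n))                  # illustrative data only
B = X - X.mean(axis=1, keepdims=True)        # center each variable (row)

C_pop = B @ B.T / n            # population version, 1/n
C_sample = B @ B.T / (n - 1)   # sample version, 1/(n-1)

# np.cov normalizes by n-1 by default; bias=True switches to n.
matches_pop = np.allclose(np.cov(X, bias=True), C_pop)
matches_sample = np.allclose(np.cov(X), C_sample)
```

The two matrices differ only by the scalar factor $\frac{n}{n-1}$, so the principal directions are the same either way and only the eigenvalues are rescaled.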