lecture-python.myst PCA description in SVD lecture

PCA description in SVD lecture

Open sidd3888 opened this issue 7 months ago • 3 comments

Hey QuantEcon folks!

I was going through the SVD lecture (lecture number 5) and came across the section on PCA (5.8). Having re-read the section a couple of times, I think the description of the process is inconsistent in a few places. The data is presented as $X$, an $m \times n$ matrix with $m$ variables and $n$ individuals.

First, I believe the text intends to compute averages by variable, not by individual, but the notation describes the latter. The average matrix $\bar{X}$ is written as a column vector of ones multiplied by $[\bar{X}_1 \cdots \bar{X}_n]$, which gives one average per individual (column) rather than one per variable (row).
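To make the distinction concrete, here is a minimal NumPy sketch. The data, the sizes `m = 3` and `n = 5`, and all variable names are made up for illustration and are not taken from the lecture.

```python
import numpy as np

# Hypothetical data: m = 3 variables (rows), n = 5 individuals (columns).
rng = np.random.default_rng(0)
m, n = 3, 5
X = rng.normal(size=(m, n))

# Averages by variable: one mean per row, kept as an m x 1 column.
var_means = X.mean(axis=1, keepdims=True)   # shape (m, 1)
X_bar = var_means @ np.ones((1, n))         # shape (m, n)
B = X - X_bar                               # each row of B has mean zero

# Averages by individual: ones (m x 1) times a row of n column means,
# which is what the notation [X_bar_1 ... X_bar_n] describes.
ind_means = X.mean(axis=0, keepdims=True)   # shape (1, n)
X_bar_ind = np.ones((m, 1)) @ ind_means     # shape (m, n)

print(var_means.shape, ind_means.shape)
```

Only the first version centers each variable, which is what a covariance matrix of the variables needs.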

Thereafter, the section on decomposing the covariance matrix uses $B^TB$ (instead of $BB^T$), which I believe would result in an $n \times n$ matrix, as opposed to the desired $m \times m$. Furthermore, the description of the decomposition includes a passage on the covariance matrix $C$ potentially not being diagonalizable, though a covariance matrix must be symmetric and positive semidefinite, and is therefore always diagonalizable.
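A quick shape check, as a sketch under assumptions: `B` below is a made-up centered $m \times n$ matrix (variables in rows), and the names `C` and `G` are mine, not the lecture's. The $\frac{1}{n}$ scaling follows the lecture's convention.

```python
import numpy as np

# Hypothetical centered data: m = 3 variables (rows), n = 5 individuals (columns).
rng = np.random.default_rng(1)
m, n = 3, 5
X = rng.normal(size=(m, n))
B = X - X.mean(axis=1, keepdims=True)

C = B @ B.T / n   # m x m: covariance of the variables
G = B.T @ B / n   # n x n: not the covariance of the variables

# C is symmetric, so eigh applies: it returns real eigenvalues and an
# orthonormal matrix of eigenvectors, i.e. C = P diag(eigvals) P^T.
eigvals, P = np.linalg.eigh(C)
reconstructed = P @ np.diag(eigvals) @ P.T

print(C.shape, G.shape)
```

Because $C$ is real and symmetric, the orthogonal diagonalization always exists, and the eigenvalues are non-negative up to rounding error.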

There might also be a typo in the last score matrix $T$.

(The covariance computation also uses $\frac{1}{n}$ rather than the sample version, $\frac{1}{n-1}$. I was not sure of the intent there, so I don't know which one was meant.)
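For reference, a small sketch of the two normalizations compared against `np.cov`, which treats rows as variables and divides by $n-1$ unless `bias=True` is passed. The data and variable names are hypothetical.

```python
import numpy as np

# Hypothetical data: m = 3 variables (rows), n = 5 individuals (columns).
rng = np.random.default_rng(2)
m, n = 3, 5
X = rng.normal(size=(m, n))
B = X - X.mean(axis=1, keepdims=True)

C_pop = B @ B.T / n            # 1/n normalization
C_sample = B @ B.T / (n - 1)   # 1/(n-1) sample normalization

print(np.allclose(C_pop, np.cov(X, bias=True)))
print(np.allclose(C_sample, np.cov(X)))
```

The two matrices differ only by the scalar factor $\frac{n}{n-1}$, so they share the same eigenvectors and only the eigenvalues are rescaled.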

Jun 29 '24 21:06 sidd3888
