
the sum of explained_variance_ratio_ is not 1 in python

Open SabriQ opened this issue 4 years ago • 4 comments

Hi. Is it normal for the sum of explained_variance_ratio_ in dPCA (Python) to be much less than 1 (about 0.5)? If not, where might I be making a mistake?

SabriQ avatar Jan 17 '21 09:01 SabriQ

Sure, if the number of components is much smaller than the data dimension, then you cannot explain all the variance in your data. Same as in PCA.
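
A minimal sketch of this point for ordinary PCA, using plain NumPy on synthetic data (the array shape and the choice of 5 components are assumptions for illustration, not anything from the dPCA package):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: 200 samples x 20 features (hypothetical, illustration only)
X = rng.normal(size=(200, 20))
X -= X.mean(axis=0)  # center each feature

# Squared singular values of the centered data are proportional to the
# variance captured by each principal component
s = np.linalg.svd(X, compute_uv=False)
ratio = s**2 / np.sum(s**2)

partial_sum = ratio[:5].sum()  # first 5 of 20 components: below 1
full_sum = ratio.sum()         # all 20 components: 1 up to rounding

print(partial_sum, full_sum)
```

The ratios only add up to 1 when every component is kept; any truncated set leaves part of the variance unexplained.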

wielandbrendel avatar Jan 25 '21 20:01 wielandbrendel

Sorry to disturb you again. Two questions. First, could replacing the NaNs in the matrix with 0 lead to a much lower explained_variance_ratio_? I found that the dPCA explained_variance_ratio_ of the first 10 components is much smaller than that of PCA, about ~1% versus ~70% in my case, and I'm wondering whether this is caused by the way I replaced the NaN values. Second, is explained_variance_ratio_ calculated differently in the MATLAB and Python code? They organize the matrix in different dimension orders.

SabriQ avatar Mar 19 '21 18:03 SabriQ

It might be due to the big difference between dimensions.

SabriQ avatar Mar 27 '21 11:03 SabriQ

I think the explained variance ratio in the Python code is wrong.

Looking at the Python code, it only uses the variance of the projection of the original data onto each decoder dimension to calculate the explained variance ratio. That would technically work for PCA, since the encoder and decoder matrices are the same and each vector has a 2-norm of 1, but that's not true here, so it doesn't work.

I think the original MATLAB code had it right: you have to reproject using the encoding matrix and then calculate the variance explained.
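
A rough NumPy sketch of the difference, not taken from either code base. The function names `projection_ratio` and `reconstruction_ratio` are hypothetical, and the reconstruction-based formula is only my reading of "reproject using the encoding matrix". The rescaled decoder/encoder pair is just a simple stand-in for a decoder whose columns are not unit-norm:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: 10 features x 500 samples with unequal feature variances
X = rng.normal(size=(10, 500)) * np.arange(1, 11)[:, None]
X -= X.mean(axis=1, keepdims=True)

def projection_ratio(X, D):
    """Variance of the decoder projections D^T X over total variance."""
    Z = D.T @ X
    return np.sum(Z**2) / np.sum(X**2)

def reconstruction_ratio(X, F, D):
    """1 - residual/total after reprojecting with the encoder: X_hat = F D^T X."""
    X_hat = F @ (D.T @ X)
    return 1.0 - np.sum((X - X_hat) ** 2) / np.sum(X**2)

# PCA case: decoder == encoder, columns orthonormal
U, _, _ = np.linalg.svd(X, full_matrices=False)
D = U[:, :3]
F = U[:, :3]
pca_proj = projection_ratio(X, D)
pca_rec = reconstruction_ratio(X, F, D)  # agrees with pca_proj

# Rescale the pair so F @ D.T is unchanged but decoder columns are not unit-norm
c = 0.5
D_s, F_s = D * c, F / c
scaled_proj = projection_ratio(X, D_s)           # shrinks by c**2
scaled_rec = reconstruction_ratio(X, F_s, D_s)   # unchanged

print(pca_proj, pca_rec, scaled_proj, scaled_rec)
```

In this toy setup the two measures coincide for PCA, but only the reconstruction-based one is unaffected by the scale of the decoder vectors.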

sean-metzger avatar Aug 17 '21 22:08 sean-metzger