factor_analyzer

Mistake in correlation function

Open pjuergens opened this issue 3 years ago • 0 comments

Describe the bug: The function corr(x) does not return a correct correlation matrix. For the dataframe I am using, with 40 rows (observations), 73 columns (variables) and no NaN values (unfortunately I cannot publish it here), the diagonal elements of the computed correlation matrix are 0.975 instead of 1. Using pandas df.corr() gives the correct result.

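It seems to me that the degrees of freedom used when calculating the covariance matrix (r = cov(x)) are wrongly set to 0 and should be 1 instead, i.e. r = cov(x, ddof=1). This would match the observed value, since 0.975 = 39/40 = (n - 1)/n for n = 40 observations.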
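A minimal sketch of how such a diagonal can arise, assuming corr(x) standardizes the columns and then takes their covariance. The helper corr_via_standardize is hypothetical and is not the library's code. It only shows that dividing by a standard deviation computed with ddof=1 (the pandas default for std) and then taking a covariance with ddof=0 scales the diagonal by (n - 1)/n.

```python
import numpy as np


def corr_via_standardize(x, std_ddof, cov_ddof):
    """Hypothetical helper: standardize the columns, then take their covariance."""
    x = np.asarray(x, dtype=float)
    z = (x - x.mean(axis=0)) / x.std(axis=0, ddof=std_ddof)
    return np.cov(z, rowvar=False, ddof=cov_ddof)


rng = np.random.default_rng(0)
data = rng.normal(size=(40, 5))  # 40 observations, as in the report; 5 columns for brevity

# Mismatched degrees of freedom: std with ddof=1, covariance with ddof=0.
mismatched = corr_via_standardize(data, std_ddof=1, cov_ddof=0)

# Matching degrees of freedom, i.e. the proposed cov(x, ddof=1).
matched = corr_via_standardize(data, std_ddof=1, cov_ddof=1)

reference = np.corrcoef(data, rowvar=False)

print(np.diag(mismatched))              # every entry equals (n - 1) / n = 39 / 40
print(np.allclose(matched, reference))  # the matched version agrees with np.corrcoef
```

If this assumption holds, any consistent choice of ddof in both steps would give a unit diagonal. Passing ddof=1 to the covariance is one such choice.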

By the way: in this case, with more variables than observations, calculate_kmo fails and returns nan, which I guess makes sense. However, a message from the underlying partial_correlations function would be very helpful.
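A rough sketch of the kind of message meant here, assuming the partial correlations are obtained by inverting the correlation matrix, which is singular whenever there are more variables than observations. The function warn_if_singular is hypothetical and not part of factor_analyzer.

```python
import warnings

import numpy as np


def warn_if_singular(x):
    """Hypothetical guard: warn when the correlation matrix of x is rank deficient."""
    x = np.asarray(x, dtype=float)
    n_obs, n_vars = x.shape
    r = np.corrcoef(x, rowvar=False)
    rank = int(np.linalg.matrix_rank(r))
    if rank < n_vars:
        warnings.warn(
            f"Correlation matrix is singular (rank {rank} for {n_vars} variables "
            f"and {n_obs} observations); partial correlations may be nan.",
            RuntimeWarning,
        )
        return True
    return False


rng = np.random.default_rng(0)
wide = rng.normal(size=(40, 73))  # same shape as in the report
tall = rng.normal(size=(200, 5))  # many more observations than variables

print(warn_if_singular(wide))  # emits the RuntimeWarning
print(warn_if_singular(tall))
```

A check like this could run before the inversion, so the user sees the cause rather than only a nan result.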

pjuergens Aug 05 '22 06:08