Loadings matrix has incorrect shape when using principal method with lapack

Open · LuciaCam opened this issue 1 year ago • 1 comment

Bug Description: When using the principal method with the lapack SVD instead of the randomized SVD, the loadings matrix returned by FactorAnalyzer is always the full matrix of shape n_cols x n_cols, rather than only the loadings for the requested n_factors. With the randomized SVD, the shape is correct.

Reproducible Code

import pandas as pd
import numpy as np
from factor_analyzer import FactorAnalyzer

num_rows = 1000
num_cols = 6
df = pd.DataFrame(
    np.random.standard_normal(size=(num_rows, num_cols)), 
    columns=[f'col{i+1}' for i in range(num_cols)])

# shape is correct with randomized
efa = FactorAnalyzer(n_factors=2, rotation='promax', method='principal', svd_method='randomized')
efa.fit(df)
print(efa.loadings_.shape)  # reported: (6, 2), i.e. n_cols x n_factors

# shape is incorrect with lapack
efa = FactorAnalyzer(n_factors=2, rotation='promax', method='principal', svd_method='lapack')
efa.fit(df)
print(efa.loadings_.shape)  # reported: (6, 6), i.e. n_cols x n_cols

Expected behavior: The shape of the .loadings_ attribute should be n_cols x n_factors.
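
To illustrate the expected shape, here is a minimal NumPy-only sketch of unrotated principal loadings computed from a full (LAPACK-backed) SVD and then truncated to n_factors columns. This is not factor_analyzer's actual implementation; the helper name `principal_loadings` is hypothetical, and rotation is left out on purpose.

```python
import numpy as np


def principal_loadings(data, n_factors):
    """Unrotated principal loadings, truncated to n_factors columns.

    Hypothetical helper for illustration only; not part of factor_analyzer.
    """
    # (n_cols, n_cols) correlation matrix of the input columns.
    corr = np.corrcoef(data, rowvar=False)
    # Full SVD of the symmetric correlation matrix; this returns all
    # n_cols components, which is why a truncation step is needed.
    u, s, _ = np.linalg.svd(corr)
    # Keep only the leading n_factors components, then scale each column
    # by the square root of its singular value.
    return u[:, :n_factors] * np.sqrt(s[:n_factors])


rng = np.random.default_rng(0)
data = rng.standard_normal(size=(1000, 6))
loadings = principal_loadings(data, n_factors=2)
print(loadings.shape)  # (6, 2)
```

Note that cutting efa.loadings_ down to its first n_factors columns after fitting is probably not an equivalent workaround when a rotation such as promax is requested, since the rotation would already have been applied to the full matrix.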

Versions (please complete the following information):

  • OS: Windows 10
  • Python: 3.10.10
  • Versions for factor_analyzer: 0.5.1 / numpy: 1.26.1 / scipy: 1.11.3 / pandas: 2.1.1

LuciaCam · Jun 17 '24 08:06

Thanks for your feedback, @LuciaCam. I will look into this.

desilinguist · Jun 17 '24 15:06