scikit-learn-intelex
[PCA] nComponents doesn't work in distributed mode
oneAPI: 2021.3
Also check #758
Running pca_batch with defaultDense
eigenvalues
[[3.39562004 1.37148866 0.23289129]]
eigenvectors
[[ 0.3902121 -0.52491954 0.32817531 -0.47574208 0.48802094]
 [ 0.5068408 -0.09939021 -0.67516516 0.40501089 0.33667814]
 [-0.74898164 -0.46713755 -0.197722 0.16702423 0.3921963 ]]
Running pca_spmd with defaultDense
eigenvalues
[[ 3.39562004e+00 1.37148866e+00 2.32891294e-01 1.02094725e-16 -2.96383754e-16]]
eigenvectors
[[ 0.3902121 -0.52491954 0.32817531 -0.47574208 0.48802094]
 [ 0.5068408 -0.09939021 -0.67516516 0.40501089 0.33667814]
 [-0.74898164 -0.46713755 -0.197722 0.16702423 0.3921963 ]
 [ 0.11966106 -0.05948796 0.62631849 0.75265846 0.15288167]
 [ 0.12471829 -0.70201479 -0.07130314 0.12346577 -0.68650758]]
Reproducer: https://github.com/intel/scikit-learn-intelex/files/6904880/pca-dal-tests.tar.gz
nComponents is available for the daal4py.pca class only as a batch parameter.
According to the oneDAL documentation and implementation, nComponents is not supported in the distributed and online computation modes. In those modes the result tables always have shape [1 or p] x p ([1 or nFeatures] x nFeatures), independent of nComponents: eigenvalues are 1 x p and eigenvectors are p x p.
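
For illustration only, here is a minimal sketch of how the full distributed result could be trimmed on the caller's side to match the batch output. It is not part of daal4py: the helper name `truncate_pca_result` is hypothetical, and it assumes the layout seen in the output above (eigenvalues as a 1 x p row, one eigenvector per row, both ordered by descending eigenvalue). The arrays are the pca_spmd values from this issue.

```python
import numpy as np


def truncate_pca_result(eigenvalues, eigenvectors, n_components):
    """Keep the first n_components of a full PCA result (hypothetical helper).

    Assumes eigenvalues has shape (1, p) and eigenvectors has shape (p, p),
    with one component per row, sorted by descending eigenvalue.
    """
    eigenvalues = np.asarray(eigenvalues)
    eigenvectors = np.asarray(eigenvectors)
    p = eigenvectors.shape[1]
    if not 1 <= n_components <= p:
        raise ValueError(f"n_components must be in [1, {p}], got {n_components}")
    # Slice columns of the eigenvalue row and rows of the eigenvector table.
    return eigenvalues[:, :n_components], eigenvectors[:n_components, :]


# Full result printed by pca_spmd in this issue.
spmd_eigenvalues = np.array(
    [[3.39562004e+00, 1.37148866e+00, 2.32891294e-01,
      1.02094725e-16, -2.96383754e-16]]
)
spmd_eigenvectors = np.array([
    [0.3902121, -0.52491954, 0.32817531, -0.47574208, 0.48802094],
    [0.5068408, -0.09939021, -0.67516516, 0.40501089, 0.33667814],
    [-0.74898164, -0.46713755, -0.197722, 0.16702423, 0.3921963],
    [0.11966106, -0.05948796, 0.62631849, 0.75265846, 0.15288167],
    [0.12471829, -0.70201479, -0.07130314, 0.12346577, -0.68650758],
])

vals, vecs = truncate_pca_result(spmd_eigenvalues, spmd_eigenvectors, 3)
print(vals.shape, vecs.shape)  # (1, 3) (3, 5)
```

With n_components set to 3, the trimmed tables have the same shapes as the pca_batch result shown earlier, and the retained values agree with it to the printed precision.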