scikit-learn-intelex
AttributeError: 'PCA' object has no attribute 'n_oversamples'
Describe the bug
Running the code below raises an error when sklearnex is combined with PCA. This command produces the error shown below:
python -m sklearnex PCA_test.py
This command does not:
python PCA_test.py
To Reproduce
Save this code as PCA_test.py and run it with the commands above:
import numpy as np
from sklearn.decomposition import PCA
data = np.random.uniform(-10, 10, (1000, 3))
pca = PCA(n_components=3)
data_pca = pca.fit(data).transform(data)
Expected behavior
No error.
Output/Screenshots
Intel(R) Extension for Scikit-learn* enabled (https://github.com/intel/scikit-learn-intelex)
Traceback (most recent call last):
File "/home/gperren/miniconda3/envs/pyupmask/lib/python3.8/runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/home/gperren/miniconda3/envs/pyupmask/lib/python3.8/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/home/gperren/miniconda3/envs/pyupmask/lib/python3.8/site-packages/sklearnex/__main__.py", line 55, in <module>
sys.exit(_main())
File "/home/gperren/miniconda3/envs/pyupmask/lib/python3.8/site-packages/sklearnex/__main__.py", line 52, in _main
runf(args.name, run_name='__main__')
File "/home/gperren/miniconda3/envs/pyupmask/lib/python3.8/runpy.py", line 265, in run_path
return _run_module_code(code, init_globals, run_name,
File "/home/gperren/miniconda3/envs/pyupmask/lib/python3.8/runpy.py", line 97, in _run_module_code
_run_code(code, mod_globals, init_globals,
File "/home/gperren/miniconda3/envs/pyupmask/lib/python3.8/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "PCA_test.py", line 9, in <module>
data_pca = pca.fit(data).transform(data)
File "/home/gperren/miniconda3/envs/pyupmask/lib/python3.8/site-packages/sklearn/decomposition/_pca.py", line 402, in fit
self.n_oversamples,
AttributeError: 'PCA' object has no attribute 'n_oversamples'
Environment:
System:
python: 3.8.13 (default, Mar 28 2022, 11:38:47) [GCC 7.5.0]
executable: /home/gperren/miniconda3/envs/pyupmask/bin/python
machine: Linux-5.0.16-100.fc28.x86_64-x86_64-with-glibc2.17
Python dependencies:
sklearn: 1.1.1
pip: 21.2.4
setuptools: 61.2.0
numpy: 1.22.3
scipy: 1.7.3
Cython: None
pandas: None
matplotlib: None
joblib: 1.1.0
threadpoolctl: 2.2.0
Built with OpenMP: True
threadpoolctl info:
filepath: /home/gperren/miniconda3/envs/pyupmask/lib/python3.8/site-packages/scikit_learn.libs/libgomp-a34b3233.so.1.0.0
prefix: libgomp
user_api: openmp
internal_api: openmp
version: None
num_threads: 48
filepath: /home/gperren/miniconda3/envs/pyupmask/lib/libmkl_rt.so.1
prefix: libmkl_rt
user_api: blas
internal_api: mkl
version: 2021.4-Product
num_threads: 24
threading_layer: intel
filepath: /home/gperren/miniconda3/envs/pyupmask/lib/libiomp5.so
prefix: libiomp
user_api: openmp
internal_api: openmp
version: None
num_threads: 48
This seems to reproduce only on macOS.
------------------------------- Captured stdout --------------------------------
Command '['/usr/local/miniconda/envs/CB/bin/python', '/Users/runner/work/1/s/daal4py/sklearn/monkeypatch/tests/utils/_launch_algorithms.py']' returned non-zero exit status 1.
------------------------------- Captured stderr --------------------------------
dispatcher.py:151: FutureWarning:
Scikit-learn patching with daal4py is deprecated and will be removed in the future.
Use Intel(R) Extension for Scikit-learn* module instead (pip install scikit-learn-intelex).
To enable patching, please use one of the following options:
1) From the command line:
python -m sklearnex <your_script>
2) From your script:
from sklearnex import patch_sklearn
patch_sklearn()
Intel(R) oneAPI Data Analytics Library solvers for sklearn enabled: https://intelpython.github.io/daal4py/sklearn.html
Traceback (most recent call last):
File "/Users/runner/work/1/s/daal4py/sklearn/monkeypatch/tests/utils/_launch_algorithms.py", line 117, in <module>
run_algotithms()
File "/Users/runner/work/1/s/daal4py/sklearn/monkeypatch/tests/utils/_launch_algorithms.py", line 93, in run_algotithms
run_patch(info, t)
File "/Users/runner/work/1/s/daal4py/sklearn/monkeypatch/tests/utils/_launch_algorithms.py", line 61, in run_patch
model.fit(X, y)
File "/usr/local/miniconda/envs/CB/lib/python3.9/site-packages/sklearn/decomposition/_pca.py", line 402, in fit
self.n_oversamples,
AttributeError: 'PCA' object has no attribute 'n_oversamples'
=========================== short test summary info ============================
ERROR s/daal4py/sklearn/monkeypatch/tests/test_patching.py - SystemExit: 1
!!!!!!!!!!!!!!!!!!!! Interrupted: 1 error during collection !!!!!!!!!!!!!!!!!!!!
Hi @Gabriel-p @FavorMylikes, thank you for your detailed reports. This seems to be caused by the new sklearn version 1.1.1, which has some new PCA parameters such as n_oversamples. So it is a sklearnex bug.
Running into the same issue, how do we solve it?
@mjoy296 Downgrade sklearn to 1.0.2.
@mjoy296, I'm running into the same issue as well.
@FavorMylikes, I appreciate your solution, but unfortunately I can't downgrade sklearn to 1.0.2 as I have dependencies on 1.1.
As seen in the latest sklearn (1.1.2) docs, there are new PCA() parameters since 1.1:
n_oversamples : int, default=10
power_iteration_normalizer : {'auto', 'QR', 'LU', 'none'}, default='auto'
Here is the evidence:
>>> from sklearnex import patch_sklearn
>>> patch_sklearn()
Intel(R) Extension for Scikit-learn* enabled (https://github.com/intel/scikit-learn-intelex)
>>> from sklearn.decomposition import PCA
>>> p = PCA()
>>> p.get_params()
{'copy': True, 'iterated_power': 'auto', 'n_components': None, 'random_state': None, 'svd_solver': 'auto', 'tol': 0.0, 'whiten': False}
>>> from sklearnex import unpatch_sklearn
>>> unpatch_sklearn()
>>> from sklearn.decomposition import PCA
>>> p = PCA()
>>> p.get_params()
{'copy': True, 'iterated_power': 'auto', 'n_components': None, 'n_oversamples': 10, 'power_iteration_normalizer': 'auto', 'random_state': None, 'svd_solver': 'auto', 'tol': 0.0, 'whiten': False}
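The two get_params() outputs above suggest a likely mechanism: the patched class defines its own __init__ without the new parameters, so the attribute is never set, while the upstream fit() still reads it. The following is a minimal sketch of that failure mode only. BasePCA, PatchedPCA, get_param_names and try_fit are hypothetical stand-ins invented for illustration; they are not code from sklearn or sklearnex.

```python
import inspect


class BasePCA:
    """Hypothetical stand-in for the upstream estimator (not sklearn code)."""

    def __init__(self, n_components=None, n_oversamples=10):
        self.n_components = n_components
        self.n_oversamples = n_oversamples

    def get_param_names(self):
        # Read parameter names from the class's own __init__ signature,
        # similar in spirit to how get_params() discovers parameters.
        sig = inspect.signature(type(self).__init__)
        return sorted(p for p in sig.parameters if p != "self")

    def fit(self, data):
        # The upstream fit reads the new attribute unconditionally.
        return (self.n_components, self.n_oversamples, len(data))


class PatchedPCA(BasePCA):
    """Hypothetical patched subclass written before the parameter existed."""

    def __init__(self, n_components=None):
        # super().__init__ is not called, so n_oversamples is never set.
        self.n_components = n_components


def try_fit(estimator):
    """Return "ok" on success, or the AttributeError message on failure."""
    try:
        estimator.fit([[0.0, 1.0]])
        return "ok"
    except AttributeError as exc:
        return str(exc)


base_result = try_fit(BasePCA())
patched_result = try_fit(PatchedPCA())
print(base_result)
print(patched_result)
```

In this sketch the inherited fit() fails on the subclass for the same reason the traceback shows: the attribute lookup happens in the parent's method, but only the subclass's __init__ ever ran.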
I wonder if sklearnex could support the new default PCA() parameters "from the future" or at least ignore their existence; otherwise I'd prefer to call sklearnex.unpatch_sklearn() just for PCA(), as its performance seems acceptable without sklearnex for now.
Eventually I would seek more PCA performance by other means, such as using GPUs with the PCA from RAPIDS/cuML.
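The idea of unpatching only PCA could be sketched roughly as follows. This assumes the installed sklearnex accepts an algorithm name in unpatch_sklearn() and that "pca" is the right key for it; both are assumptions that should be checked against the installed version. The helper name unpatch_pca_only is made up for this example.

```python
def unpatch_pca_only():
    """Patch sklearn with sklearnex, then restore only the stock PCA.

    Hypothetical helper; returns the PCA class that is active afterwards.
    """
    # Local imports so this module still loads if sklearnex is not installed.
    from sklearnex import patch_sklearn, unpatch_sklearn

    # Enable the accelerated estimators.
    patch_sklearn()

    # Assumption: unpatch_sklearn takes an algorithm name, and "pca" is
    # the key under which PCA is registered.
    unpatch_sklearn("pca")

    # Import after patching/unpatching so the name resolves to the
    # currently active (stock) class.
    from sklearn.decomposition import PCA

    return PCA
```

If this works as assumed, calling PCA = unpatch_pca_only() and then inspecting PCA().get_params() should list n_oversamples again, as in the unpatched session above, while other estimators stay patched.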