scikit-learn-intelex
[enhancement] enable dpnp/dpctl testing in test_patching
Description
So far, test_patching has been limited to CPU algorithms. This PR enables get_dataframes_and_queues to test GPU support when it is available.
Changes proposed in this pull request:
- remove _numpy only in pytest.mark.parametrize("dataframe,queue")
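The change above can be pictured with a minimal sketch of how a list of (dataframe, queue) pairs might be built. This is not the actual get_dataframes_and_queues implementation: the function name `available_dataframes_and_queues` and its `device_filter` parameter are hypothetical, and the only library calls assumed are dpctl's `has_cpu_devices`, `has_gpu_devices`, and `SyclQueue`.

```python
import importlib.util


def available_dataframes_and_queues(device_filter=("cpu", "gpu")):
    """Hypothetical stand-in for get_dataframes_and_queues (illustration only).

    Returns a list of (dataframe_name, queue) pairs. NumPy is always
    listed with queue=None; dpctl and dpnp entries are added only when
    dpctl is importable and a matching SYCL device exists.
    """
    params = [("numpy", None)]

    if importlib.util.find_spec("dpctl") is None:
        # No SYCL runtime installed: fall back to NumPy-only testing.
        return params

    import dpctl

    queues = []
    if "cpu" in device_filter and dpctl.has_cpu_devices():
        queues.append(dpctl.SyclQueue("cpu"))
    if "gpu" in device_filter and dpctl.has_gpu_devices():
        queues.append(dpctl.SyclQueue("gpu"))

    for queue in queues:
        params.append(("dpctl", queue))
        if importlib.util.find_spec("dpnp") is not None:
            params.append(("dpnp", queue))
    return params


if __name__ == "__main__":
    for name, queue in available_dataframes_and_queues():
        print(name, queue)
```

A list like this could then be passed as the second argument of pytest.mark.parametrize("dataframe,queue", ...), so that each patched estimator is exercised once per available dataframe type and device.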
Tasks
- [x] Remove numpy check
- [ ] Find failure points in public CI
- [ ] Find failure points in private CI
- [ ] Correct issues
- [ ] Pass public CI
- [ ] Pass private CI
625 test case failures in sklearn1.2 :(
423 fails remaining
/azp run CI
Azure Pipelines successfully started running 1 pipeline(s).
/intelci: run
/intelci: run
Looks like pairwise distances and KMeans on GPU have issues, on top of the score method for classification (as shown on private CI).
8 remaining
/intelci: run
/azp run CI
/intelci: run
/intelci: run
/intelci: run
I am going to continue looking at why KMeans is passing sklearn conformance testing on GPU, but otherwise this PR is ready to go.
Currently investigating why KMeans on GPU is passing private CI. I have forced an assert for GPU offloading in daal4py's _device_offload when the object is KMeans: https://github.com/intel/scikit-learn-intelex/compare/main...icfaust:scikit-learn-intelex:test/test_kmeans?expand=1 and I am running private CI against this repo: http://intel-ci.intel.com/eeeaa574-a9c3-f1c6-a2b8-a4bf010d0e2e
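As a rough illustration of the kind of check described here, the sketch below asserts that a KMeans object is paired with a GPU queue. It is not the code from the linked branch: the helper `assert_gpu_offload` and its `checked_names` parameter are hypothetical, and the fake queues only mimic the `sycl_device.is_gpu` attributes that a dpctl SyclQueue is assumed to expose.

```python
from types import SimpleNamespace


def assert_gpu_offload(estimator, queue, checked_names=("KMeans",)):
    """Hypothetical check: fail loudly if a listed estimator is not on a GPU queue.

    Estimators whose class name is not in checked_names are ignored.
    """
    if type(estimator).__name__ not in checked_names:
        return
    device = getattr(queue, "sycl_device", None)
    assert device is not None and device.is_gpu, (
        f"{type(estimator).__name__} was expected to be offloaded to GPU"
    )


# Stand-in objects so the sketch runs without dpctl or a real device.
class KMeans:
    pass


gpu_queue = SimpleNamespace(sycl_device=SimpleNamespace(is_gpu=True))
cpu_queue = SimpleNamespace(sycl_device=SimpleNamespace(is_gpu=False))

assert_gpu_offload(KMeans(), gpu_queue)  # passes silently

try:
    assert_gpu_offload(KMeans(), cpu_queue)
except AssertionError as err:
    print("caught:", err)
```

With a check like this in the dispatch path, a test run that silently falls back to CPU would fail instead of passing, which is what makes an unexpected pass easier to diagnose.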
All GPU testing for KMeans is currently deactivated. This PR keeps it that way, but extra care will be needed when KMeans moves out of preview because of this. If this PR is merged before KMeans moves out of preview, reintroducing this testing will become a prerequisite for that move.
/intelci: run
CI looks close, but some test_stacking tests fail. Could this be related to predict/_predict in the random forest classifier?
Looking very nice so far
/intelci: run
/intelci: run
/intelci: run
I have slightly modified CI to find the long-running tests in sklearnex with dpnp/dpctl support on Linux. This will be reverted later.
Things to fix in the future:
- LogisticRegression's decision_function costs ~1 minute of CI time, but it does run to completion. Why isn't this also the case for linear regression?
- The score method of all regression estimators, which uses the r2 score and is likely not working well with array_api namespaces for sklearn < 1.2 (it runs to completion but is slow).
https://dev.azure.com/daal/daal4py/_build/results?buildId=33073&view=logs&j=517fe804-fa30-5dc2-1413-330699242c05&t=517fe804-fa30-5dc2-1413-330699242c05
/intelci: run
/intelci: run
/intelci: run
/azp run CI
Azure Pipelines successfully started running 1 pipeline(s).
/intelci: run