[CI, enhancement] add sklearnex CI job with nightly oneDAL DPCPP build in github actions

Open icfaust opened this issue 9 months ago • 13 comments

Description

This PR expands sklearnex testing to work with the latest oneDAL builds. For Windows testing it uses a custom Visual Studio / Intel DPC++ compiler (AVX2) build; for Linux testing it uses an Intel DPC++ build (AVX2). Only publicly available components are used (matching what an external contributor has access to), and the process follows what an external contributor would have to do to hand-build oneDAL/sklearnex.

It has:

  1. a nightly build version of oneDAL/main for testing without waiting for release
  2. additional DPCtl/dpnp windows testing
  3. a venv/pip build strategy for expanding robustness in our build processes
  4. building oneDAL with gcc on Linux and vc on Windows (using as much out-of-the-box tooling as possible)
  5. more free runners from a different source, avoiding runner availability issues on Azure DevOps
  6. builds without the dpc backend, to test the robustness of the non-SYCLqueue/dummySYCLqueue backend
  7. preparations for integrating public CI coverage checking
  8. fixes for not-yet-seen test case failures (uncovered by the new CI; they will occur in 2024.6)
  9. fixes to conda-specific behaviors which occur in onedal.__init__ and daal4py.__init__
  10. fixes to generalization problems in various CI scripts (which were conda-focused)
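
As an illustration of the venv/pip strategy in item 3, a minimal sketch using only the Python standard library is shown here. It is not the script used in this PR; the function name `create_build_env` and the directory name in the usage note are hypothetical, chosen only for the example.

```python
import os
import venv
from pathlib import Path


def create_build_env(env_dir: Path, with_pip: bool = True) -> Path:
    """Create a virtual environment and return the path to its interpreter.

    Hypothetical helper for illustration; the PR's CI scripts may differ.
    """
    # clear=True removes any existing environment at env_dir first,
    # so repeated CI runs start from a clean state.
    venv.EnvBuilder(with_pip=with_pip, clear=True).create(env_dir)

    # Windows places executables in Scripts/, POSIX systems in bin/.
    if os.name == "nt":
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"
```

A build step could then call `create_build_env(Path(".venv"))` and run `<interpreter> -m pip install ...` with the returned interpreter, which avoids any dependence on conda being present.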

Regressions:

  • Additional tests had to be deselected due to the Visual Studio build (6 LocalOutlierFactor tests)
  • pytest downgrade for Python 3.10 on Windows

Note: this PR removes the GitHub runner introduced in #1844, because the github.token secret already has sufficient (read) permissions to access artifacts from oneDAL (and other open-source GitHub repos).
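
To make the token point concrete, here is a hedged sketch of how a read-only token can be attached to a GitHub REST API request that lists a repository's workflow artifacts. The PR text does not say whether the workflow calls the REST API directly or goes through an action, so this is only one possible approach. The helper name `artifacts_request` and the `<owner>`/`<repo>` placeholders are assumptions, and the request is only constructed, not sent.

```python
import urllib.request


def artifacts_request(owner: str, repo: str, token: str,
                      per_page: int = 30) -> urllib.request.Request:
    """Build (but do not send) a request listing a repo's workflow artifacts.

    Hypothetical helper; owner/repo are supplied by the caller.
    """
    url = (
        f"https://api.github.com/repos/{owner}/{repo}"
        f"/actions/artifacts?per_page={per_page}"
    )
    req = urllib.request.Request(url, method="GET")
    # Media type recommended by the GitHub REST API.
    req.add_header("Accept", "application/vnd.github+json")
    # In a workflow, the token would come from the github.token secret.
    req.add_header("Authorization", f"Bearer {token}")
    return req
```

Passing the result to `urllib.request.urlopen` would perform the call; for a public repository, read access is all the listing requires.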

The current setup tests the following on both Windows and Linux (matching the current CI for verification):

Python  Sklearn  DPCTL
3.9     1.1      yes
3.10    1.2      no
3.11    1.3      yes
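
For reference, the same matrix can be written as data and expanded across both platforms. This is only a sketch to show the resulting six combinations; the names `MatrixEntry`, `MATRIX`, `PLATFORMS`, and `job_names` are illustrative and do not come from the workflow files.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MatrixEntry:
    python: str
    sklearn: str
    dpctl: bool


# Values copied from the table above.
MATRIX = (
    MatrixEntry("3.9", "1.1", True),
    MatrixEntry("3.10", "1.2", False),
    MatrixEntry("3.11", "1.3", True),
)
PLATFORMS = ("Linux", "Windows")


def job_names() -> list[str]:
    """Expand the matrix over all platforms into readable job names."""
    return [
        f"{platform} / Python {entry.python} / sklearn {entry.sklearn} / "
        + ("DPCtl" if entry.dpctl else "no DPCtl")
        for platform in PLATFORMS
        for entry in MATRIX
    ]
```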

Further development would be required to change these, due to the fragility of some of the CI setup scripts and the related pandas/scipy/numpy versions (e.g. some work will be required after new values are determined).

icfaust, May 24 '24 11:05