`n_jobs` support details in docs
Description
Adds a doc page for n_jobs specifics of sklearnex.
Checklist to comply with before moving PR from draft:
PR completeness and readability
- [x] I have reviewed my changes thoroughly before submitting this pull request.
- [x] I have commented my code, particularly in hard-to-understand areas.
- [x] I have updated the documentation to reflect the changes or created a separate PR with update and provided its number in the description, if necessary.
- [x] Git commit message contains an appropriate signed-off-by string (see CONTRIBUTING.md for details).
- [x] I have added a respective label(s) to PR if I have a permission for that.
- [x] I have resolved any merge conflicts that might occur with the base branch.
Testing
- [x] I have run it locally and tested the changes extensively.
- [x] All CI jobs are green or I have provided justification why they aren't.
- [x] I have extended testing suite if new functionality was introduced in this PR.
Performance
N/A
Codecov Report
:white_check_mark: All modified and coverable lines are covered by tests.
| Flag | Coverage Δ | |
|---|---|---|
| azure | ? | |
| github | 71.96% <ø> (ø) | |

Flags with carried forward coverage won't be shown. 41 files have indirect coverage changes.
Thanks for adding these explanations. But it's still missing important pieces of information and leaves several questions unanswered:
- It's missing the threading part of MKL, the static linkage part, and how it interacts with environment variables, the `n_jobs` parameter, the `inner_max_num_threads` parameter, and threadpoolctl configurations.
- It doesn't mention how the threading works when put under a threadpoolctl context.
- The explanation is unclear about what ends up happening with the number of threads when using environment variables in addition to passing `n_jobs` as a parameter.
- Could mention what happens with `n_jobs` in GPU mode.
- There's a difference in the threading configuration logic between daal4py and sklearnex, which this doc could also mention.
- It doesn't cover the part about some configurations being global, which is quite relevant when using python-based multi-threading.
- It could mention that the TBB threading doesn't automatically avoid nested parallelism when used in conjunction with OpenMP (which sklearn uses) and/or with joblib or python threads.
- Some estimators perform better when not using all threads: for example, linear regression is faster on LNL laptops when it does not use the low-power E-cores. The doc could mention this sort of thing here, as it is relevant.
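
To make the nested-parallelism point concrete, here is a minimal sketch of the thread arithmetic involved. It is not sklearnex, oneDAL, or TBB code: the helper names `total_threads` and `inner_thread_budget` are hypothetical, and the `cpu_count // outer_workers` split is an assumed budgeting rule chosen for illustration, not the library's documented behavior.

```python
import os
from typing import Optional


def total_threads(outer_workers: int, inner_threads: int) -> int:
    """Threads alive at once when every outer worker starts its own inner pool."""
    return outer_workers * inner_threads


def inner_thread_budget(outer_workers: int, cpu_count: Optional[int] = None) -> int:
    """Inner threads per worker so that the total does not exceed cpu_count.

    Hypothetical helper: an even split of the cores, with a floor of one thread.
    """
    if outer_workers < 1:
        raise ValueError("outer_workers must be >= 1")
    if cpu_count is None:
        # os.cpu_count() may return None on some platforms.
        cpu_count = os.cpu_count() or 1
    return max(cpu_count // outer_workers, 1)


if __name__ == "__main__":
    cpus, workers = 8, 4
    # Each worker uses all cores for its inner pool: oversubscription.
    print(total_threads(workers, cpus))  # 32
    # Each worker gets an even share of the cores instead.
    budget = inner_thread_budget(workers, cpus)
    print(total_threads(workers, budget))  # 8
```

With 4 outer workers on 8 cores, unbounded inner pools give 32 threads competing for 8 cores, while an even split keeps the total at 8. The doc could state which of these two situations applies when sklearnex estimators run inside joblib workers or Python threads.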
@Alexsandruss make sure to merge main for latest CI checks on docs
Closing in favor of https://github.com/uxlfoundation/scikit-learn-intelex/pull/2768