More granular CPU features detection

Open Vika-F opened this issue 7 months ago • 10 comments

Description

An API for run-time CPU feature detection was added to DAAL and oneDAL. The following features are included in the initial list:

  • SpeedStep
  • Turbo Boost
  • bfloat16
  • AVX512 VNNI
  • Turbo Boost Max 3.0

Checklist to comply with before moving PR from draft:

PR completeness and readability

  • [x] I have reviewed my changes thoroughly before submitting this pull request.
  • [x] I have commented my code, particularly in hard-to-understand areas.
  • [x] Git commit message contains an appropriate signed-off-by string (see CONTRIBUTING.md for details).
  • [x] I have added a respective label(s) to PR if I have a permission for that.
  • [x] I have resolved any merge conflicts that might occur with the base branch.

Testing

  • [x] I have run it locally and tested the changes extensively.
  • [x] All CI jobs are green or I have provided justification why they aren't.

Performance

not applicable

Vika-F avatar Apr 14 '25 13:04 Vika-F

@Alexandr-Solovev Yes, I can do that, just to make this info printable on demand via cpu_info. But the main goal of this PR is to allow the following kind of check in the code of DAAL algorithms:

if (__daal_serv_cpu_feature_detect() & daal::CpuFeature::tb3)
{
    /// Do something if Turbo Boost Max 3.0 Technology is present on the machine
}
else
{
    /// Do something else if Turbo Boost Max 3.0 Technology is _not_ present on the machine
}

The modifications in cpu_info are needed only to provide a way to validate that the checks actually work.
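As an illustration of the bitmask check above, here is a minimal, self-contained sketch. Only the flag name `tb3` comes from this thread; the other enumerator names, the bit positions, `detect_cpu_features_stub`, `has_feature`, `CodePath`, and `pick_code_path` are hypothetical stand-ins and are not the actual oneDAL definitions.

```cpp
#include <cstdint>

// Hypothetical feature flags, one bit per feature.
// Names (except tb3) and bit positions are assumptions for illustration only.
enum CpuFeature : std::uint64_t
{
    unknown     = 0,
    sstep       = 1ULL << 0, // SpeedStep
    tb          = 1ULL << 1, // Turbo Boost
    avx512_bf16 = 1ULL << 2, // bfloat16
    avx512_vnni = 1ULL << 3, // AVX512 VNNI
    tb3         = 1ULL << 4  // Turbo Boost Max 3.0
};

// Hypothetical stand-in for a run-time detector: pretends the machine
// has SpeedStep, Turbo Boost and AVX512 VNNI, but not Turbo Boost Max 3.0.
constexpr std::uint64_t detect_cpu_features_stub()
{
    return sstep | tb | avx512_vnni;
}

// True if the given feature bit is set in the mask.
constexpr bool has_feature(std::uint64_t mask, CpuFeature feature)
{
    return (mask & feature) != 0;
}

// The two branches an algorithm could choose between.
enum class CodePath
{
    generic,
    tb3_aware
};

// Mirrors the if/else from the comment above: select a branch by feature bit.
constexpr CodePath pick_code_path(std::uint64_t mask)
{
    return has_feature(mask, tb3) ? CodePath::tb3_aware : CodePath::generic;
}
```

With the stub above, `pick_code_path(detect_cpu_features_stub())` selects `CodePath::generic`, because the `tb3` bit is not set in the returned mask.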

Vika-F avatar Apr 17 '25 12:04 Vika-F

General question: Is it possible to add these specs to this info: https://github.com/uxlfoundation/oneDAL/blob/main/cpp%2Fdaal%2Fsrc%2Fservices%2Flibrary_version_info.cpp#L53 ? It could extend future profiler capabilities.

My previous comment was premature. Unfortunately, I cannot add this info directly to the LibraryVersionInfo struct because that would break the ABI.

But the information about CPU features can be obtained or printed using the oneDAL API, as shown in this test: https://github.com/uxlfoundation/oneDAL/blob/main/cpp/oneapi/dal/test/global_context.cpp#L135
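To illustrate the ABI remark above: adding a member to a struct that is part of a public interface changes its size and can shift the offsets of existing members, so code compiled against the old layout reads the wrong bytes. The two structs below are hypothetical stand-ins and do not reproduce the real LibraryVersionInfo definition.

```cpp
#include <cstddef>

// Hypothetical "old" layout, as seen by already-compiled client code.
struct VersionInfoV1
{
    int majorVersion;
    int minorVersion;
    const char * build;
};

// Hypothetical "new" layout with a CPU feature mask inserted in the middle.
struct VersionInfoV2
{
    int majorVersion;
    int minorVersion;
    unsigned long long cpuFeatures; // newly added field (illustrative)
    const char * build;
};

// True if the struct grew, so old callers allocate too little memory for it.
constexpr bool size_changed()
{
    return sizeof(VersionInfoV2) > sizeof(VersionInfoV1);
}

// True if an existing member moved, so old callers read it from a stale offset.
constexpr bool build_offset_changed()
{
    return offsetof(VersionInfoV2, build) != offsetof(VersionInfoV1, build);
}
```

Both helpers evaluate to `true` on common 32-bit and 64-bit ABIs, which is why exposing the feature mask through a separate API call is the safer route.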

Vika-F avatar Apr 23 '25 08:04 Vika-F

/intelci: run

Vika-F avatar Apr 23 '25 12:04 Vika-F

/intelci: run

Vika-F avatar Apr 24 '25 13:04 Vika-F

Sorry for the off-topic, but things are now in the correct config, and @rakshithgb-fujitsu @keeranroth, you are now proper approvers.

I will work on creating a PR for codeowners updates so that you are automatically assigned as reviewers for some of the aspects.

napetrov avatar Apr 24 '25 15:04 napetrov

Re-run CI: http://intel-ci.intel.com/f021d3db-fe7e-f156-a9a2-a4bf010d0e2d

Vika-F avatar Apr 25 '25 12:04 Vika-F

/intelci: run

Vika-F avatar May 05 '25 07:05 Vika-F

/intelci: run

Vika-F avatar May 08 '25 12:05 Vika-F

/intelci: run

Vika-F avatar May 20 '25 10:05 Vika-F

/intelci: run

Vika-F avatar Jun 27 '25 08:06 Vika-F

/intelci: run

Vika-F avatar Jul 07 '25 13:07 Vika-F

/intelci: run

Vika-F avatar Jul 07 '25 15:07 Vika-F

@Vika-F The latest commits do not solve the segfault issue, but interestingly, they do change the error message:

==3295017== Jump to the invalid address stated on the next line
==3295017==    at 0x0: ???
==3295017==    by 0x451B9BF3: __cxx_global_var_init.1 (cpu.hpp:77)
==3295017==    by 0x451B9EEE: _GLOBAL__sub_I_homogen.cpp (homogen.cpp:0)
==3295017==    by 0x46E28B9: call_init.part.0 (in /usr/lib64/ld-2.28.so)
==3295017==    by 0x46E29B9: _dl_init (in /usr/lib64/ld-2.28.so)
==3295017==    by 0x5BE427B: _dl_catch_exception (in /usr/lib64/libc-2.28.so)
==3295017==    by 0x46E6E8D: dl_open_worker (in /usr/lib64/ld-2.28.so)
==3295017==    by 0x5BE4223: _dl_catch_exception (in /usr/lib64/libc-2.28.so)
==3295017==    by 0x46E66B0: _dl_open (in /usr/lib64/ld-2.28.so)
==3295017==    by 0x53231E9: dlopen_doit (in /usr/lib64/libdl-2.28.so)
==3295017==    by 0x5BE4223: _dl_catch_exception (in /usr/lib64/libc-2.28.so)
==3295017==    by 0x5BE42E2: _dl_catch_error (in /usr/lib64/libc-2.28.so)
==3295017==  Address 0x0 is not stack'd, malloc'd or (recently) free'd
==3295017== 
==3295017== 
==3295017== Process terminating with default action of signal 11 (SIGSEGV): dumping core
==3295017==  Bad permissions for mapped region at address 0x0
==3295017==    at 0x0: ???
==3295017==    by 0x451B9BF3: __cxx_global_var_init.1 (cpu.hpp:77)
==3295017==    by 0x451B9EEE: _GLOBAL__sub_I_homogen.cpp (homogen.cpp:0)
==3295017==    by 0x46E28B9: call_init.part.0 (in /usr/lib64/ld-2.28.so)
==3295017==    by 0x46E29B9: _dl_init (in /usr/lib64/ld-2.28.so)
==3295017==    by 0x5BE427B: _dl_catch_exception (in /usr/lib64/libc-2.28.so)
==3295017==    by 0x46E6E8D: dl_open_worker (in /usr/lib64/ld-2.28.so)
==3295017==    by 0x5BE4223: _dl_catch_exception (in /usr/lib64/libc-2.28.so)
==3295017==    by 0x46E66B0: _dl_open (in /usr/lib64/ld-2.28.so)
==3295017==    by 0x53231E9: dlopen_doit (in /usr/lib64/libdl-2.28.so)
==3295017==    by 0x5BE4223: _dl_catch_exception (in /usr/lib64/libc-2.28.so)
==3295017==    by 0x5BE42E2: _dl_catch_error (in /usr/lib64/libc-2.28.so)

david-cortes-intel avatar Jul 07 '25 15:07 david-cortes-intel

/intelci: run

Vika-F avatar Jul 08 '25 10:07 Vika-F