
Building Git head version of "zello_sysman.cpp" fails

Open eero-t opened this issue 1 year ago • 4 comments

While the latest release version (23.05.25593.9) works fine, building the Git head version of zello_sysman with:

g++ -O2 -Wall -o zello_sysman zello_sysman.cpp -lze_loader -locloc

fails with the following errors:

zello_sysman.cpp: In function 'void getSysmanDeviceHandles(_ze_driver_handle_t*&, std::vector<_ze_device_handle_t*>&)':
zello_sysman.cpp:166:18: error: 'zesInit' was not declared in this scope; did you mean 'zeInit'?
  166 |     VALIDATECALL(zesInit(0));
      |                  ^~~~~~~
zello_sysman.cpp:94:25: note: in definition of macro 'VALIDATECALL'
   94 |         ze_result_t r = myZeCall;             \
      |                         ^~~~~~~~
zello_sysman.cpp:169:18: error: 'zesDriverGet' was not declared in this scope; did you mean 'zeDriverGet'?
  169 |     VALIDATECALL(zesDriverGet(&driverCount, nullptr));
      |                  ^~~~~~~~~~~~
zello_sysman.cpp:94:25: note: in definition of macro 'VALIDATECALL'
   94 |         ze_result_t r = myZeCall;             \
      |                         ^~~~~~~~
zello_sysman.cpp:174:18: error: 'zesDriverGet' was not declared in this scope; did you mean 'zeDriverGet'?
  174 |     VALIDATECALL(zesDriverGet(&driverCount, &sysmanDriverHandle));
      |                  ^~~~~~~~~~~~
zello_sysman.cpp:94:25: note: in definition of macro 'VALIDATECALL'
   94 |         ze_result_t r = myZeCall;             \
      |                         ^~~~~~~~
zello_sysman.cpp:177:18: error: 'zesDeviceGet' was not declared in this scope; did you mean 'zeDeviceGet'?
  177 |     VALIDATECALL(zesDeviceGet(sysmanDriverHandle, &deviceCount, nullptr));
      |                  ^~~~~~~~~~~~
zello_sysman.cpp:94:25: note: in definition of macro 'VALIDATECALL'
   94 |         ze_result_t r = myZeCall;             \
      |                         ^~~~~~~~
zello_sysman.cpp:183:18: error: 'zesDeviceGet' was not declared in this scope; did you mean 'zeDeviceGet'?
  183 |     VALIDATECALL(zesDeviceGet(sysmanDriverHandle, &deviceCount, sysmanDevices.data()));
      |                  ^~~~~~~~~~~~
zello_sysman.cpp:94:25: note: in definition of macro 'VALIDATECALL'
   94 |         ze_result_t r = myZeCall;             \
      |                         ^~~~~~~~
zello_sysman.cpp: In function 'void testSysmanFabricPort(_ze_device_handle_t*&)':
zello_sysman.cpp:1147:9: error: 'zes_fabric_port_error_counters_t' was not declared in this scope; did you mean 'zes_fabric_port_throughput_t'?
 1147 |         zes_fabric_port_error_counters_t fabricPortErrorCounters = {};
      |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      |         zes_fabric_port_throughput_t
zello_sysman.cpp:1197:67: error: 'fabricPortErrorCounters' was not declared in this scope
 1197 |         VALIDATECALL(zesFabricPortGetFabricErrorCounters(handle, &fabricPortErrorCounters));
      |                                                                   ^~~~~~~~~~~~~~~~~~~~~~~
zello_sysman.cpp:94:25: note: in definition of macro 'VALIDATECALL'
   94 |         ze_result_t r = myZeCall;             \
      |                         ^~~~~~~~
zello_sysman.cpp:1197:22: error: 'zesFabricPortGetFabricErrorCounters' was not declared in this scope
 1197 |         VALIDATECALL(zesFabricPortGetFabricErrorCounters(handle, &fabricPortErrorCounters));
      |                      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
zello_sysman.cpp:94:25: note: in definition of macro 'VALIDATECALL'
   94 |         ze_result_t r = myZeCall;             \
      |                         ^~~~~~~~
zello_sysman.cpp:1199:48: error: 'fabricPortErrorCounters' was not declared in this scope
 1199 |             std::cout << "Link Failures = " << fabricPortErrorCounters.linkFailureCount << std::endl;
      |                                                ^~~~~~~~~~~~~~~~~~~~~~~

Note: as this is a very useful tool for testing Sysman implementations, I'm building it against the release version of L0, not against the Git head version. Others may well do the same, as this tool has been mentioned in various public Sysman-related tickets (not just in this project).

eero-t avatar Mar 10 '23 11:03 eero-t

Hi @eero-t, please make sure your Level Zero headers are the version specified in the manifest of the neo repo as of the commit you are building: https://github.com/intel/compute-runtime/blob/master/manifests/manifest.yml#L51

JablonskiMateusz avatar Mar 13 '23 09:03 JablonskiMateusz

Ok, so it requires the latest L0 v1.9.9 release, released 10 days ago. I assume it should work also with newer versions?

Because this tool is of general use, including with distro versions of L0 & compute-runtime, I think it would be good to have a better warning when the L0 version is too old, something like this in the code:

#if !defined(L0_HEADER_VERSION) || L0_HEADER_VERSION < 10909
#  error "v1.9.9 or newer L0 headers required"
#endif 

However, the L0 headers seem to be missing defines that could be used for such a check. I've filed ticket https://github.com/oneapi-src/level-zero/issues/111 for that. Let's keep this open until that is resolved.

eero-t avatar Mar 13 '23 11:03 eero-t

After such defines are provided by L0, zello_sysman could also use suitable ifdefs to build support for newer L0 features only when the system L0 version (headers) supports them.

This would require occasionally removing older ifdefs [1] to keep the code maintainable, so it might be too much effort compared to just updating the required L0_HEADER_VERSION error check.

[1] E.g. when distros upgrade the L0 versions that zello_sysman might get compiled against.

eero-t avatar Mar 13 '23 11:03 eero-t

> Ok, so it requires the latest L0 v1.9.9 release, released 10 days ago. I assume it should work also with newer versions?

When checking the L0 APIs more carefully, spec version 1.5.0, added in L0 frontend v1.9, should actually suffice.

Based on that, and info from https://github.com/oneapi-src/level-zero/issues/111, adding something like this (untested):

#if ZE_API_VERSION_CURRENT < ZE_MAKE_VERSION( 1, 5 )
# error "zello_sysman.cpp requires L0 API >= v1.5"
#endif

to zello_sysman.cpp would be enough to fix this ticket (no additional L0 API extension version checks are needed).
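
One caveat on the untested snippet above: in the L0 headers I am aware of, ZE_API_VERSION_CURRENT is an enumerator of ze_api_version_t rather than a preprocessor macro. If that holds for the headers in use, `#if` would treat it as 0 and the `#error` would always fire. A compile-time check with `static_assert` avoids this, because enumerators are constant expressions. Below is a minimal sketch of that variant. The stand-in definitions are only used when `<level_zero/ze_api.h>` is not found, and are assumed to mirror the real header; `apiMajor` and `apiMinor` are hypothetical helper names used for illustration only.

```cpp
#include <cstdint>

// Use the real L0 header when it is installed.
#if defined(__has_include)
#  if __has_include(<level_zero/ze_api.h>)
#    include <level_zero/ze_api.h>
#    define SKETCH_HAVE_L0_HEADERS 1
#  endif
#endif

#ifndef SKETCH_HAVE_L0_HEADERS
// Stand-ins (assumption: same major<<16 | minor packing as ze_api.h),
// so that the sketch compiles without L0 installed.
#define ZE_MAKE_VERSION(_major, _minor) (((_major) << 16) | ((_minor) & 0x0000ffff))
typedef enum _ze_api_version_t {
    ZE_API_VERSION_CURRENT = ZE_MAKE_VERSION(1, 5)
} ze_api_version_t;
#endif

// Evaluated by the compiler proper, so it works with an enumerator,
// unlike a preprocessor #if.
static_assert(ZE_API_VERSION_CURRENT >= ZE_MAKE_VERSION(1, 5),
              "zello_sysman.cpp requires L0 API >= v1.5");

// Hypothetical helpers: split a packed version for diagnostics.
constexpr std::uint32_t apiMajor(std::uint32_t version) { return version >> 16; }
constexpr std::uint32_t apiMinor(std::uint32_t version) { return version & 0x0000ffffu; }
```

With headers older than spec 1.5 the build then stops at the `static_assert` message instead of the long list of undeclared `zes*` symbols shown in the original report.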

PS. Intel repos apparently still have only the v1.8.x frontend version (I was using those when I hit this issue), as does Debian: https://salsa.debian.org/debian/level-zero/-/blob/master/debian/changelog

The latest Fedora, on the other hand, already has the v1.9.x L0 frontend in F37 and newer: https://src.fedoraproject.org/rpms/oneapi-level-zero

eero-t avatar Mar 13 '23 14:03 eero-t