OpenCL-ICD-Loader icon indicating copy to clipboard operation
OpenCL-ICD-Loader copied to clipboard

Memory leaks in libOpenCL.so.1!khrIcdVendorAdd

Open MichaelTutin opened this issue 3 years ago • 7 comments

Intel Inspector detects memory leaks for two allocations in khrIcdVendorAdd():

  1. Error: Memory leak: 16 Bytes:

libOpenCL.so.1!khrIcdVendorAdd - /OpenCL-ICD-Loader/loader/icd.c:135 libOpenCL.so.1!khrIcdOsVendorsEnumerate - /OpenCL-ICD-Loader/loader/linux/icd_linux.c:125

  1. Error: Memory leak: 80 Bytes

libOpenCL.so.1!khrIcdVendorAdd - /OpenCL-ICD-Loader/loader/icd.c:153 libOpenCL.so.1!khrIcdOsVendorsEnumerate - /OpenCL-ICD-Loader/loader/linux/icd_linux.c:125

Looking at the code in icd.c is seems that vendor structure is initialized and added into khrIcdVendors global list, but there is no explicit deinitialization function that would free those elements. Probably one of the options to fix that issue would be to add a function with _attribute_((destructor)) on Linux or DllMain() handler on Windows to handle library unload event and deallocate those blocks.

MichaelTutin avatar Nov 12 '21 10:11 MichaelTutin

Thanks for reporting the issue. The termination of the loader is an open problem. I'll try to summarize the issues here:

  • We leak memory,
  • Libraries are not unloaded (drivers, layers),
  • We don't have a termination API for layers (so they must use at_exit() for termination).

Adding termination to the loader nonetheless may present a risk that user application (less likely) or libraries (more likely) would break, especially those that rely on at_exit or similar callbacks (C++ static destructors?) that call OpenCL APIs (usually to release OpenCL resources), and that would be called after the loader is de-initialized.

I think this topic needs to be addressed, but needs careful discussion/consideration. It should be noted that race conditions in shared libraries loading/unloading is not a subject I am particularly confident discussing, so everything I say here should be taken with a grain of salt.

note: ocl-icd is behaving similarly.

note: the layer infrastructure may be able to help here: we could add a no-op stub/layer once we enter de-initialization (and either always return SUCCESS or a dedicated error code?). This should prevent segfaults inside the loader. I think CUDA has a similar mechanism.

Kerilk avatar Nov 12 '21 16:11 Kerilk

Any updates on this issue?

Yanfeng-Mi avatar Nov 17 '22 03:11 Yanfeng-Mi

Is there any update on fixing this issue? Thanks!

wenjiew avatar Mar 07 '23 04:03 wenjiew

How can we fix this issue?

Rikyu1 avatar Mar 15 '23 10:03 Rikyu1

Created PR with fix: https://github.com/KhronosGroup/OpenCL-ICD-Loader/pull/217

JablonskiMateusz avatar Apr 11 '23 12:04 JablonskiMateusz

One item we briefly discussed in last week's teleconference: We recently added the ability to have "loader extensions" (see the ICD loader info PR https://github.com/KhronosGroup/OpenCL-ICD-Loader/pull/188). Could we add a similar "loader extension" that allows an application to safely and explicitly de-initialize the ICD loader on the application's schedule, rather than de-initializing the ICD loader implicitly when the ICD loader is unloaded?

This isn't a perfect solution because it requires an explicit call to "clean up", but it avoids some of the unloading race condition concerns described above.

bashbaug avatar May 02 '23 15:05 bashbaug

We meet the same issue on Windows. Load the library, call clGetPlatformIDs, free library, then memory leak happen.

MicroYY avatar Jul 23 '24 07:07 MicroYY