pyopencl
POCL driver not found when installing in virtualenv
Describe the bug
When pyopencl[pocl] is installed in a virtual environment on a system with no other OpenCL drivers, the POCL ICD is not found. It's necessary to set the environment variable OCL_ICD_VENDORS to <path-to-pyopencl-install>/.libs to get pyopencl to see POCL as a driver. This is not documented in the pyopencl documentation, which suggests that simply installing the pyopencl wheel with the pocl extra is sufficient.
To Reproduce
Steps to reproduce the behavior:
- Create a new virtual environment on a machine without OpenCL installed:
python3 -m venv /tmp/venv && . /tmp/venv/bin/activate
- Install pyopencl with the pocl extra:
pip install pyopencl[pocl]
- Run
python -c 'import pyopencl; pyopencl.get_platforms()'
- See error
pyopencl._cl.LogicError: clGetPlatformIDs failed: PLATFORM_NOT_FOUND_KHR
Expected behavior
get_platforms() should return a POCL platform.
Environment (please complete the following information):
- OS: Linux Mint 20.3 (based on Ubuntu 20.04LTS)
- ICD Loader and version: libOpenCL-cf4d6695 from pyopencl[pocl] wheel
- ICD and version: libpocl-3a06e60a from pyopencl[pocl] wheel
- CPU/GPU: Intel Core i7-7600U CPU
- Python version: 3.8.10
- PyOpenCL version: 2022.1
Additional context
The same issue is present on a Scientific Linux 7 (RHEL7 clone) system with Python 3.7. On this system the Python executable is provided by Anaconda, but a standard virtual environment created using the venv standard library module is used rather than a conda environment. The workaround of setting OCL_ICD_VENDORS still works on this system.
The use case is creating virtualenvs to test code using pyopencl, where root access to install system-wide OpenCL is not available and the availability of conda is not guaranteed.
The closest I've got to a portable workaround is:
export OCL_ICD_VENDORS=$(python -c 'import os, pyopencl; print(os.path.join(*pyopencl.__path__, ".libs"))')
But this enforces the use of POCL and so isn't a universal solution as it shouldn't be applied to systems which do already have OpenCL installed globally.
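A less intrusive variant of this workaround could be sketched in Python: set OCL_ICD_VENDORS only when it is not already set and the system-wide vendors directory contains no .icd files. This is a minimal sketch, not part of pyopencl. The function names are hypothetical, /etc/OpenCL/vendors is taken from the loader's debug output in this thread, and pocl.icd is the file name reported there.

```python
import glob
import importlib.util
import os


def system_has_icds(vendors_dir):
    """Return True if the given vendors directory holds at least one .icd file."""
    return bool(glob.glob(os.path.join(vendors_dir, "*.icd")))


def bundled_vendors_dir(package="pyopencl"):
    """Locate <package>/.libs without importing the package (hypothetical helper)."""
    spec = importlib.util.find_spec(package)
    if spec is None or not spec.submodule_search_locations:
        return None
    return os.path.join(list(spec.submodule_search_locations)[0], ".libs")


def choose_vendors_dir(env, system_dir, bundled_dir):
    """Return a value for OCL_ICD_VENDORS, or None to leave the loader alone."""
    if "OCL_ICD_VENDORS" in env:
        return None  # respect an explicit user setting
    if system_has_icds(system_dir):
        return None  # a system-wide OpenCL installation exists
    if bundled_dir and os.path.isfile(os.path.join(bundled_dir, "pocl.icd")):
        return bundled_dir
    return None
```

One way to use it is to call choose_vendors_dir(os.environ, "/etc/OpenCL/vendors", bundled_vendors_dir()) before the first OpenCL call and, if the result is not None, assign it to os.environ["OCL_ICD_VENDORS"]. This assumes the loader reads the variable lazily at the first platform query, which has not been verified here.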
Thanks for the report!
The way this is supposed to work is that the ICD loader bundled into the pyopencl wheel has that search path baked in:
https://github.com/inducer/pyopencl/blob/0b3d0ef92497e6838eea300b974f385f94cb5100/scripts/build-wheels.sh#L43-L44
That points to this patch:
https://github.com/isuruf/ocl-icd/commit/3862386b51930f95d9ad1089f7157a98165d5a6b.patch
Do you have any sense why that scheme isn't working as intended? (Maybe investigate with strace?)
Attached are two straces. The first is running the following command, without specifying the OCL_ICD_VENDORS variable:
strace python -c 'import pyopencl; pyopencl.get_platforms()'
The second is with setting the environment variable:
OCL_ICD_VENDORS=/tmp/pocl-venv/lib/python3.8/site-packages/pyopencl/.libs strace python -c 'import pyopencl; pyopencl.get_platforms()'
It looks like the significant difference is from line 3295 of the traces: in the first instance it attempts to open /etc/OpenCL/vendors, which fails with ENOENT, and then attempts a bunch of paths which end in <string>. In the second case it opens /tmp/pocl-venv/lib/python3.8/site-packages/pyopencl/.libs/ and successfully finds the pocl.icd file and in turn the POCL driver.
The use of <string> looks suspiciously like the variable hasn't been defined properly, but I don't know enough about how the system works internally to tell whether this is a problem or not.
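Comparing two long traces by eye is tedious, so a small filter that keeps only the open calls touching ICD-related paths may help. The following is a rough sketch: the attached traces are not reproduced in this thread, so the line format is an assumption based on how strace usually prints open/openat calls, and the function name and the default search strings are hypothetical.

```python
import re

# Assumed strace line shape, for example:
#   openat(AT_FDCWD, "/etc/OpenCL/vendors", O_RDONLY|...) = -1 ENOENT (No such file or directory)
OPEN_RE = re.compile(
    r'^(?:\[pid\s+\d+\]\s+)?open(?:at)?\((?:[^,"]+,\s*)?"([^"]*)".*\)\s+=\s+(-?\d+)(?:\s+(\w+))?'
)


def icd_opens(trace_lines, needles=("OpenCL/vendors", ".icd", ".libs")):
    """Yield (path, succeeded, errno_name) for open calls whose path mentions a needle."""
    for line in trace_lines:
        match = OPEN_RE.match(line)
        if not match:
            continue
        path, ret, err = match.group(1), int(match.group(2)), match.group(3)
        if any(needle in path for needle in needles):
            yield path, ret >= 0, err
```

Running it over both trace files, for instance with list(icd_opens(open("trace.txt"))) where trace.txt is a placeholder name, and comparing the two results should show which directories the loader tried in each case.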
What do you get when you run export OCL_ICD_DEBUG=7 and then start the python interpreter?
(pocl-venv) jlovell@jlovell-thinkpad:~$ OCL_ICD_DEBUG=7 python -c 'import pyopencl; pyopencl.get_platforms()'
ocl-icd(ocl_icd_loader.c:737): __initClIcd: Reading icd list from '/etc/OpenCL/vendors'
ocl-icd(ocl_icd_loader.c:1029): clGetPlatformIDs: return: -1001/0xfffffffffffffc17
Traceback (most recent call last):
File "<string>", line 1, in <module>
pyopencl._cl.LogicError: clGetPlatformIDs failed: PLATFORM_NOT_FOUND_KHR
When I manually specify the path to the ICD directory it looks there instead of in /etc/OpenCL/vendors:
(pocl-venv) jlovell@jlovell-thinkpad:~$ OCL_ICD_DEBUG=7 OCL_ICD_VENDORS=/tmp/pocl-venv/lib/python3.8/site-packages/pyopencl/.libs/ python -c 'import pyopencl; pyopencl.get_platforms()'
ocl-icd(ocl_icd_loader.c:737): __initClIcd: Reading icd list from '/tmp/pocl-venv/lib/python3.8/site-packages/pyopencl/.libs/'
ocl-icd(ocl_icd_loader.c:201): _find_num_icds: return: 1/0x1
ocl-icd(ocl_icd_loader.c:232): _open_driver: Considering file '/tmp/pocl-venv/lib/python3.8/site-packages/pyopencl/.libs//pocl.icd'
ocl-icd(ocl_icd_loader.c:206): _load_icd: Loading ICD 'libpocl-3a06e60a.so'
ocl-icd(ocl_icd_loader.c:210): _load_icd: ICD[0] loaded
ocl-icd(ocl_icd_loader.c:264): _open_driver: return: 1/0x1
ocl-icd(ocl_icd_loader.c:276): _open_drivers: return: 1/0x1
ocl-icd(ocl_icd_loader.c:232): _open_driver: Considering file '/tmp/pocl-venv/lib/python3.8/site-packages/pyopencl/.libs/pocl.icd'
ocl-icd(ocl_icd_loader.c:206): _load_icd: Loading ICD 'libpocl-3a06e60a.so'
ocl-icd(ocl_icd_loader.c:210): _load_icd: ICD[1] loaded
ocl-icd(ocl_icd_loader.c:264): _open_driver: return: 2/0x2
ocl-icd(ocl_icd_loader.c:276): _open_drivers: return: 2/0x2
ocl-icd(ocl_icd_loader.c:433): _find_and_check_platforms: Checking ICD 0/2
ocl-icd(ocl_icd_loader.c:281): _get_function_addr: Looking for function clGetExtensionFunctionAddress
ocl-icd(ocl_icd_loader.c:299): _get_function_addr: return: 139776962235520/0x7f205c2e8c80
ocl-icd(ocl_icd_loader.c:281): _get_function_addr: Looking for function clIcdGetPlatformIDsKHR
ocl-icd(ocl_icd_loader.c:284): _get_function_addr: Missing global symbol 'clIcdGetPlatformIDsKHR' in ICD, should be skipped
ocl-icd(ocl_icd_loader.c:299): _get_function_addr: return: 139776962236064/0x7f205c2e8ea0
ocl-icd(ocl_icd_loader.c:281): _get_function_addr: Looking for function clGetPlatformInfo
ocl-icd(ocl_icd_loader.c:284): _get_function_addr: Missing global symbol 'clGetPlatformInfo' in ICD, should be skipped
ocl-icd(ocl_icd_loader.c:299): _get_function_addr: return: 139776962163152/0x7f205c2d71d0
ocl-icd(ocl_icd_loader.c:482): _find_and_check_platforms: Try to load 1 platforms
ocl-icd(ocl_icd_loader.c:304): _allocate_platforms: Requesting allocation for 1 platforms
ocl-icd(ocl_icd_loader.c:314): _allocate_platforms: return: 1/0x1
ocl-icd(ocl_icd_loader.c:489): _find_and_check_platforms: Checking platform 0
ocl-icd(ocl_icd_loader.c:340): _malloc_clGetPlatformInfo: return: cl_khr_icd
ocl-icd(ocl_icd_loader.c:340): _malloc_clGetPlatformInfo: return: POCL
ocl-icd(ocl_icd_loader.c:559): _find_and_check_platforms: Extension suffix: POCL
ocl-icd(ocl_icd_loader.c:340): _malloc_clGetPlatformInfo: return: FULL_PROFILE
ocl-icd(ocl_icd_loader.c:340): _malloc_clGetPlatformInfo: return: OpenCL 1.2 pocl 1.3 Release, LLVM 7.0.1, SLEEF, DISTRO, POCL_DEBUG
ocl-icd(ocl_icd_loader.c:340): _malloc_clGetPlatformInfo: return: Portable Computing Language
ocl-icd(ocl_icd_loader.c:340): _malloc_clGetPlatformInfo: return: The pocl project
ocl-icd(ocl_icd_loader.c:433): _find_and_check_platforms: Checking ICD 1/2
ocl-icd(ocl_icd_loader.c:281): _get_function_addr: Looking for function clGetExtensionFunctionAddress
ocl-icd(ocl_icd_loader.c:299): _get_function_addr: return: 139776962235520/0x7f205c2e8c80
ocl-icd(ocl_icd_loader.c:281): _get_function_addr: Looking for function clIcdGetPlatformIDsKHR
ocl-icd(ocl_icd_loader.c:284): _get_function_addr: Missing global symbol 'clIcdGetPlatformIDsKHR' in ICD, should be skipped
ocl-icd(ocl_icd_loader.c:299): _get_function_addr: return: 139776962236064/0x7f205c2e8ea0
ocl-icd(ocl_icd_loader.c:281): _get_function_addr: Looking for function clGetPlatformInfo
ocl-icd(ocl_icd_loader.c:284): _get_function_addr: Missing global symbol 'clGetPlatformInfo' in ICD, should be skipped
ocl-icd(ocl_icd_loader.c:299): _get_function_addr: return: 139776962163152/0x7f205c2d71d0
ocl-icd(ocl_icd_loader.c:482): _find_and_check_platforms: Try to load 1 platforms
ocl-icd(ocl_icd_loader.c:304): _allocate_platforms: Requesting allocation for 1 platforms
ocl-icd(ocl_icd_loader.c:314): _allocate_platforms: return: 1/0x1
ocl-icd(ocl_icd_loader.c:489): _find_and_check_platforms: Checking platform 0
ocl-icd(ocl_icd_loader.c:340): _malloc_clGetPlatformInfo: return: cl_khr_icd
ocl-icd(ocl_icd_loader.c:340): _malloc_clGetPlatformInfo: return: POCL
ocl-icd(ocl_icd_loader.c:559): _find_and_check_platforms: Extension suffix: POCL
ocl-icd(ocl_icd_loader.c:340): _malloc_clGetPlatformInfo: return: FULL_PROFILE
ocl-icd(ocl_icd_loader.c:340): _malloc_clGetPlatformInfo: return: OpenCL 1.2 pocl 1.3 Release, LLVM 7.0.1, SLEEF, DISTRO, POCL_DEBUG
ocl-icd(ocl_icd_loader.c:340): _malloc_clGetPlatformInfo: return: Portable Computing Language
ocl-icd(ocl_icd_loader.c:340): _malloc_clGetPlatformInfo: return: The pocl project
ocl-icd(ocl_icd_loader.c:387): _sort_platforms: Nb platefroms: 2
ocl-icd(ocl_icd_loader.c:398): _sort_platforms: Platform sorted by GPU, CPU, DEV
ocl-icd(ocl_icd_loader.c:793): __initClIcd: 2 valid vendor(s)!
ocl-icd(ocl_icd_loader.c:1025): clGetPlatformIDs: Entering
Same behaviour in the interactive python interpreter.
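The debug log above appears to consider pocl.icd twice, once as .libs//pocl.icd and once as .libs/pocl.icd, and ends with "2 valid vendor(s)". A small sketch for spotting such duplicates in a captured OCL_ICD_DEBUG log follows. It relies only on the "Considering file" message format shown above. The function name is hypothetical, and the sketch does not explain why the loader visits the file twice.

```python
import os
import re
from collections import Counter

# Message format taken from the OCL_ICD_DEBUG=7 output quoted in this thread.
CONSIDER_RE = re.compile(r"_open_driver: Considering file '([^']+)'")


def considered_icds(log_lines):
    """Count how often each ICD file was considered, after normalising its path."""
    counts = Counter()
    for line in log_lines:
        match = CONSIDER_RE.search(line)
        if match:
            # normpath collapses the doubled slash, so both spellings compare equal
            counts[os.path.normpath(match.group(1))] += 1
    return counts
```

Any entry with a count above one points at an ICD file that the loader reached through more than one spelling of the same path.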
Manually setting PYOPENCL_HOME before starting python also fails in the same way as if it is not set. So it looks like the environment variable isn't getting picked up.
Definitely using the wheel-provided libOpenCL too, so it should have the patch you mentioned. Grepping that SO does indicate it has PYOPENCL_HOME inside the library.
(pocl-venv) jlovell@jlovell-thinkpad:~$ ldd /tmp/pocl-venv/lib/python3.8/site-packages/pyopencl/_cl.cpython-38-x86_64-linux-gnu.so
linux-vdso.so.1 (0x00007ffd960ea000)
libOpenCL-cf4d6695.so.1.0.0 => /tmp/pocl-venv/lib/python3.8/site-packages/pyopencl/.libs/libOpenCL-cf4d6695.so.1.0.0 (0x00007f1b1e233000)
libstdc++.so.6 => /lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f1b1e024000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f1b1ded5000)
libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f1b1deba000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f1b1de97000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f1b1dca5000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f1b1dc9d000)
/lib64/ld-linux-x86-64.so.2 (0x00007f1b1e37a000)
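The grep check on the shared object can also be done from Python, which is convenient inside a test script. This is a minimal sketch with a hypothetical function name. It only confirms that a byte string such as PYOPENCL_HOME is present in the file, not that the patched lookup actually runs.

```python
def file_contains(path, needle, chunk_size=1 << 20):
    """Return True if the bytes `needle` occur anywhere in the file at `path`."""
    overlap = max(len(needle) - 1, 0)
    tail = b""
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                return False
            # Keep the end of the previous window so matches that span
            # a chunk boundary are still found.
            window = tail + chunk
            if needle in window:
                return True
            tail = window[-overlap:] if overlap else b""
```

For example, file_contains(".../pyopencl/.libs/libOpenCL-cf4d6695.so.1.0.0", b"PYOPENCL_HOME") should return True for the library shown in the ldd output above, given the grep result reported in this thread. The leading part of the path is left elided here and depends on the virtualenv.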
That's mysterious. Why does it say "missing global symbol" and then return an address for it? And why does this work on other systems?
I've been able to reproduce this using Github Actions: compare https://github.com/cherab/core/runs/5290607772?check_suite_focus=true where I didn't properly set the OCL_ICD_VENDORS environment variable for the job with https://github.com/cherab/core/runs/5291207783?check_suite_focus=true where I managed to do it correctly. So it should be possible for you to reproduce this too for testing.
I'm afraid I don't know enough about the OpenCL loader to speculate on why it's doing this on some systems but not others.
Update: the workaround of manually setting OCL_ICD_VENDORS no longer works with pyopencl 2022.2.3.
Could you use some of the same troubleshooting techniques (strace, ldd) to see why that might be happening?
Fixed in https://github.com/inducer/pyopencl/pull/635
Seems to work with the wheels in the #635 build artifacts, thanks!
Took a bit of trial and error, as I hadn't realised that the pocl ICD was added to site-packages/pyopencl/.libs by pocl-binary-distribution and not pyopencl. I then got confused as to why those files were missing after uninstalling the previous version of pyopencl and deleting the pyopencl directory entirely in site-packages. Reinstalling pocl-binary-distribution along with the patched version of pyopencl fixed things.
I can also confirm that it's no longer necessary with #635 to manually set OCL_ICD_VENDORS for the ICD to be picked up.