Question: does this require vendor non-free code to work?
Can this be made to work with pure open source tools, or do I have to hunt down vendor-specific tools?
@znmeb In theory it should work with any OpenCL implementation. For example, the package does work with beignet (a completely open source OpenCL implementation which is unfortunately still vendor specific for Intel). Khronos simply maintains and defines the standard and it is usually implemented by hardware vendors like AMD and NVIDIA. Another possible open source implementation of interest is pocl but I haven't attempted that yet. Let me know if you make any attempt with any pure open source OpenCL platform.
I have Ubuntu 16.04 on a laptop with an integrated Intel GPU and a workstation with an AMD/ATI Bonaire XT Radeon HD 7790/8770 / R7 360 / R9 260/360 OEM. Should I try the pocl option?
@znmeb feel free to give pocl a try and let me know how it goes. Otherwise you can install beignet for the Intel GPU and the AMD APP SDK for the AMD GPU. I would love it if pocl worked for all hardware and operating systems.
I cloned the pocl repo - getting a few cmake errors and ran out of troubleshooting time for today. I'm sure it's something simple like missing "-dev" packages but I'm not a cmake user.
@znmeb Apologies for the lack of response on this issue. I recently installed pocl on an Ubuntu 16.04 virtual machine. The develop version of gpuR now installs and appears to work properly. However, it still appears to be necessary to set the environment variable OPENCL_LIB to point to where pocl is installed (default is /usr/local/lib). I think there is some way to have the ICD work better with pocl but I haven't quite figured that out yet.
If you can successfully install pocl, please verify that the 'develop' version installs and works properly for you:
devtools::install_github('cdeterman/gpuR', ref = 'develop')
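Since OPENCL_LIB may still need to point at the pocl install, a quick check before installing can save a failed build. This is only a minimal shell sketch, assuming the default /usr/local/lib location mentioned above. The helper name `has_opencl_lib` and the file-name patterns `libOpenCL.so*` and `libpocl.so*` are illustrative assumptions, not something gpuR or pocl defines.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given directory contains a file
# that looks like an OpenCL runtime library.
# The file-name patterns are assumptions about a typical pocl install.
has_opencl_lib() {
    dir="$1"
    for f in "$dir"/libOpenCL.so* "$dir"/libpocl.so*; do
        # An unmatched glob stays literal, so test that the file exists.
        [ -e "$f" ] && return 0
    done
    return 1
}

# Default taken from the comment above; export OPENCL_LIB first to override.
OPENCL_LIB="${OPENCL_LIB:-/usr/local/lib}"
export OPENCL_LIB

if has_opencl_lib "$OPENCL_LIB"; then
    echo "OpenCL library found in $OPENCL_LIB"
else
    echo "no OpenCL library found in $OPENCL_LIB" >&2
fi
```

With the variable exported, start R from the same shell session so the install command above inherits it.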
OK ... I'm on Fedora 25 beta now ... both beignet and pocl are already in the distribution. I'll check this out when I get back to my workstation tonight.
@znmeb any updates on if you could get this to work? I am hoping to finally get the next version of gpuR released and I want to confirm that my testing with pocl can be accomplished by another user.
@cdeterman I'm on Linux Mint 18 now - testing today
Dang - still can't get pocl to compile. I'm looking for a package now.
@cdeterman It turns out it's packaged for 16.10 but not for 16.04. The pocl developers appear to be working with Fedora. Do you have a script for building pocl on 16.04?
@znmeb Not offhand, I built it on the fly. I will see if I can put together a script. Probably a good idea to have on hand anyway.
I found the problem - it needs 'libclang-dev'. Build is running now.
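Missing prerequisites like this tend to surface as opaque cmake errors, so it can help to check for them up front. A minimal sketch: the helper `missing_tools` is hypothetical, and the tool list (cmake, clang, llvm-config, pkg-config) is an assumption about what a pocl source build typically needs, not a list taken from the pocl documentation. It only checks commands on PATH, and some distributions install versioned names instead. A development package such as libclang-dev still has to be checked through the package manager.

```shell
#!/bin/sh
# Hypothetical helper: print each named command that is not on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Assumed tool list for a pocl source build; adjust to your setup.
missing=$(missing_tools cmake clang llvm-config pkg-config)

if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names print on one line.
    echo "missing build tools:" $missing >&2
else
    echo "all listed build tools found"
fi
```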
https://github.com/znmeb/gpuR-on-Sarah - tested on workstation (pocl and AMD, standard Linux drivers)
I'll be testing the beignet Intel GPU on the laptop later today.
well ... beignet isn't working on Ubuntu 16.04. I'm not sure what's happening ... the specific Linux driver is i915 and the Intel OpenCL SDK doesn't yet support 16.04. I think it's a Linux kernel issue. I do have the build from the beignet source working but the run-time ain't happening.
I can't switch OS versions till March - my "day job" (volunteer work) is on 16.04. I'm sure all the vendors will catch up to 16.04 eventually.
In any event pocl builds and runs fine from source on 16.04 on both my 8-core AMD workstation and dual-core i5 laptop. beignet would be icing on the cake; odds are I'll be buying a gamer laptop with an NVidia GPU and running Windows 10 Pro before all the software crap's worked out. ;-)
Hmm... I have successfully installed and used Beignet on 14.04. And another user verified the install works on the latest Debian testing release. I would think it should work on 16.04. I will see if I can experiment a little bit.
@znmeb Oh I think I just realized that you mean that beignet DID work when you built it from source just not from the repos? Can you confirm this? Just want to be clear on the status. This would constitute a different note/issue for tracking. I am trying to clear the 1.2.0 version issue so I can release the updated version to CRAN.
- Beignet from Ubuntu 16.04 does not work with my specific Intel GPU. Neither does Beignet compiled from source.
- From what I was able to determine with searches, the issue is in the kernel module for i915. 14.04 (Trusty) is two years old and probably has patches to the kernel. For the 16.04 kernel to get a patch, Intel, Ubuntu and probably Linus need to approve it.
- There is an Intel SDK for OpenCL at https://software.intel.com/en-us/intel-opencl. They support CentOS (I forget which version) and some Ubuntus before 16.04 but not 16.04. One of the forum posts implied it was "coming" but as you probably know engineers don't commit to dates. ;-)
For the record, mesa-opencl-icd from 16.04 does work with the pure open source kernel / Xorg AMD/ATI drivers on my card, which reports the following:
Device Name                                     AMD BONAIRE (DRM 2.43.0, LLVM 3.8.0)
Device Vendor                                   AMD
Device Vendor ID                                0x1002
Device Version                                  OpenCL 1.1 MESA 11.2.0
Driver Version                                  11.2.0
Device OpenCL C Version                         OpenCL C 1.1
Device Type                                     GPU
Device Profile                                  FULL_PROFILE
Max compute units                               14
Max clock frequency                             1075MHz
Max work item dimensions                        3
Max work item sizes                             256x256x256
Max work group size                             256
Preferred work group size multiple              64
Preferred / native vector sizes
  char                                          16 / 16
  short                                         8 / 8
  int                                           4 / 4
  long                                          2 / 2
  half                                          0 / 0 (n/a)
  float                                         4 / 4
  double                                        2 / 2 (cl_khr_fp64)
Half-precision Floating-point support           (n/a)
Single-precision Floating-point support         (core)
  Denormals                                     No
  Infinity and NANs                             Yes
  Round to nearest                              Yes
  Round to zero                                 No
  Round to infinity                             No
  IEEE754-2008 fused multiply-add               No
  Support is emulated in software               No
  Correctly-rounded divide and sqrt operations  No
Double-precision Floating-point support         (cl_khr_fp64)
  Denormals                                     Yes
  Infinity and NANs                             Yes
  Round to nearest                              Yes
  Round to zero                                 Yes
  Round to infinity                             Yes
  IEEE754-2008 fused multiply-add               Yes
  Support is emulated in software               No
  Correctly-rounded divide and sqrt operations  No
Address bits                                    32, Little-Endian
Global memory size                              1073741824 (1024MiB)
Error Correction support                        No
Max memory allocation                           268435456 (256MiB)
Unified memory for Host and Device              Yes
Minimum alignment for any data type             128 bytes
Alignment of base address                       1024 bits (128 bytes)
Global Memory cache type                        None
Image support                                   No
Local memory type                               Local
Local memory size                               32768 (32KiB)
Max constant buffer size                        268435456 (256MiB)
Max number of constant args                     16
Max size of kernel argument                     1024
Queue properties
  Out-of-order execution                        No
  Profiling                                     Yes
Profiling timer resolution                      0ns
Execution capabilities
  Run OpenCL kernels                            Yes
  Run native kernels                            No
Device Available                                Yes
Compiler Available                              Yes
Device Extensions                               cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_khr_fp64
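
To compare OpenCL platforms across machines without pasting a full dump like the one above, the few identifying fields can be filtered out. A minimal sketch, assuming the dump comes from a tool such as clinfo that prints one label and value per line. The function name `summarize_device` and the choice of fields are illustrative.

```shell
#!/bin/sh
# Hypothetical helper: keep only the identifying lines of a device dump
# read from standard input. Leading indentation is tolerated.
# Note: the "Device Vendor" pattern also matches "Device Vendor ID".
summarize_device() {
    grep -E '^[[:space:]]*(Device Name|Device Vendor|Device Version|Driver Version|Max compute units|Global memory size)[[:space:]]'
}
```

Typical use would be something like `clinfo | summarize_device`, or `summarize_device < saved-dump.txt` for a dump saved to a file (the file name is a placeholder).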