gpuR

Question: does this require vendor non-free code to work?

Open znmeb opened this issue 9 years ago • 17 comments

Can this be made to work with pure open source tools, or do I have to hunt down vendor-specific tools?

znmeb avatar May 05 '16 23:05 znmeb

@znmeb In theory it should work with any OpenCL implementation. For example, the package does work with beignet (a completely open source OpenCL implementation which is unfortunately still vendor specific for Intel). Khronos simply maintains and defines the standard and it is usually implemented by hardware vendors like AMD and NVIDIA. Another possible open source implementation of interest is pocl but I haven't attempted that yet. Let me know if you make any attempt with any pure open source OpenCL platform.

cdeterman avatar May 06 '16 12:05 cdeterman

I have Ubuntu 16.04 on a laptop with an integrated Intel GPU and a workstation with an AMD/ATI Bonaire XT Radeon HD 7790/8770 / R7 360 / R9 260/360 OEM. Should I try the pocl option?

znmeb avatar May 06 '16 20:05 znmeb

@znmeb feel free to give pocl a try and let me know how it goes. Otherwise you can install beignet for the Intel gpu and the AMD APP SDK for the AMD gpu. I would love it if pocl would work for all hardware and operating systems.

cdeterman avatar May 07 '16 00:05 cdeterman

I cloned the pocl repo - getting a few cmake errors and ran out of troubleshooting time for today. I'm sure it's something simple like missing "-dev" packages but I'm not a cmake user.

znmeb avatar May 07 '16 04:05 znmeb

@znmeb Apologies for the lack of response on this issue. I have just recently installed pocl on an Ubuntu 16.04 virtual machine. The develop version of gpuR now installs and appears to work properly. However, it still appears to be necessary to set the environment variable OPENCL_LIB to point to where pocl is installed (default is /usr/local/lib). I think there is some way to have the ICD work better with pocl but I haven't quite figured that out yet.

If you can successfully install pocl, please verify that the 'develop' version installs and works properly for you.

devtools::install_github('cdeterman/gpuR', ref = 'develop')
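
A minimal sketch of the environment setup described above, assuming pocl went to the default /usr/local/lib and that its library files are named libOpenCL* or libpocl* (an assumption about the install layout, not something gpuR documents here). The helper `find_opencl_lib` is a hypothetical name used only for illustration.

```shell
#!/bin/sh
# Print the first OpenCL-looking library found in a directory.
# find_opencl_lib is a hypothetical helper, not part of gpuR or pocl.
find_opencl_lib() {
    for f in "$1"/libOpenCL* "$1"/libpocl*; do
        if [ -e "$f" ]; then
            echo "$f"
            return 0
        fi
    done
    return 1
}

# Assumption: pocl was installed in the default location mentioned above.
OPENCL_LIB="${OPENCL_LIB:-/usr/local/lib}"
export OPENCL_LIB

# Report what is visible there before attempting the gpuR install.
if lib=$(find_opencl_lib "$OPENCL_LIB"); then
    echo "found $lib; OPENCL_LIB=$OPENCL_LIB"
else
    echo "no libOpenCL or libpocl under $OPENCL_LIB" >&2
fi
```

With the variable exported, the install line above can be run from the same shell session, for example through `Rscript -e`, so that the build sees OPENCL_LIB.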

cdeterman avatar Oct 19 '16 19:10 cdeterman

OK ... I'm on Fedora 25 beta now ... both beignet and pocl are already in the distribution. I'll check this out when I get back to my workstation tonight.

znmeb avatar Oct 19 '16 22:10 znmeb

@znmeb any update on whether you could get this to work? I am hoping to finally get the next version of gpuR released and I want to confirm that my testing with pocl can be reproduced by another user.

cdeterman avatar Nov 01 '16 14:11 cdeterman

@cdeterman I'm on Linux Mint 18 now - testing today

Dang - still can't get pocl to compile. I'm looking for a package now.

znmeb avatar Nov 01 '16 16:11 znmeb

@cdeterman It turns out it's packaged for 16.10 but not for 16.04. The pocl developers appear to be working with Fedora. Do you have a script for building pocl on 16.04?

znmeb avatar Nov 01 '16 17:11 znmeb

@znmeb Not offhand, I built it on the fly. I will see if I can put together a script. Probably a good idea to have on hand anyway.

cdeterman avatar Nov 01 '16 17:11 cdeterman

I found the problem - it needs 'libclang-dev'. Build is running now.
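
A rough sketch of checking for that package before re-running cmake, assuming a Debian/Ubuntu system with `dpkg-query` available. `missing_packages` is a hypothetical helper name, and libclang-dev is the only package named in this thread; a pocl build may well need others.

```shell
#!/bin/sh
# List which of the given Debian/Ubuntu packages are not fully installed.
# missing_packages is a hypothetical helper name.
missing_packages() {
    for pkg in "$@"; do
        dpkg-query -W -f='${Status}' "$pkg" 2>/dev/null \
            | grep -q 'install ok installed' || echo "$pkg"
    done
}

# Only libclang-dev is known from the thread to be required.
missing=$(missing_packages libclang-dev)
if [ -n "$missing" ]; then
    echo "install first: sudo apt-get install $missing"
else
    echo "libclang-dev is present; cmake can be re-run"
fi
```

On a system without dpkg the helper simply reports every package as missing, so it is only meaningful on Debian-derived distributions.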

znmeb avatar Nov 01 '16 17:11 znmeb

https://github.com/znmeb/gpuR-on-Sarah - tested on workstation (pocl and AMD, standard Linux drivers)

I'll be testing the beignet Intel GPU on the laptop later today.

znmeb avatar Nov 01 '16 19:11 znmeb

well ... beignet isn't working on Ubuntu 16.04. I'm not sure what's happening ... the specific Linux driver is i915 and the Intel OpenCL SDK doesn't yet support 16.04. I think it's a Linux kernel issue. I do have the build from the beignet source working but the run-time ain't happening.

I can't switch OS versions till March - my "day job" (volunteer work) is on 16.04. I'm sure all the vendors will catch up to 16.04 eventually.

In any event pocl builds and runs fine from source on 16.04 on both my 8-core AMD workstation and dual-core i5 laptop. beignet would be icing on the cake; odds are I'll be buying a gamer laptop with an NVIDIA GPU and running Windows 10 Pro before all the software crap's worked out. ;-)

znmeb avatar Nov 02 '16 00:11 znmeb

Hmm... I have successfully installed and used Beignet on 14.04. And another user verified the install works on the latest Debian testing release. I would think it should work on 16.04. I will see if I can experiment a little bit.

cdeterman avatar Nov 02 '16 13:11 cdeterman

@znmeb Oh, I think I just realized what you mean: beignet DID work when you built it from source, just not from the repos? Can you confirm this? I just want to be clear on the status. This would constitute a different note/issue for tracking. I am trying to clear the 1.2.0 version issue so I can release the updated version to CRAN.

cdeterman avatar Nov 02 '16 20:11 cdeterman

  1. Beignet from Ubuntu 16.04 does not work with my specific Intel GPU. Neither does Beignet compiled from source.
  2. From what I was able to determine with searches, the issue is in the kernel module for i915. 14.04 (Trusty) is two years old and probably has patches to the kernel. For the 16.04 kernel to get a patch, Intel, Ubuntu and probably Linus need to approve it.
  3. There is an Intel SDK for OpenCL at https://software.intel.com/en-us/intel-opencl. They support CentOS (I forget which version) and some Ubuntus before 16.04 but not 16.04. One of the forum posts implied it was "coming" but as you probably know engineers don't commit to dates. ;-)

znmeb avatar Nov 02 '16 21:11 znmeb

For the record, mesa-opencl-icd from 16.04 does work with the pure open source kernel / Xorg AMD/ATI drivers on my card, which is

  Device Name                                     AMD BONAIRE (DRM 2.43.0, LLVM 3.8.0)
  Device Vendor                                   AMD
  Device Vendor ID                                0x1002
  Device Version                                  OpenCL 1.1 MESA 11.2.0
  Driver Version                                  11.2.0
  Device OpenCL C Version                         OpenCL C 1.1 
  Device Type                                     GPU
  Device Profile                                  FULL_PROFILE
  Max compute units                               14
  Max clock frequency                             1075MHz
  Max work item dimensions                        3
  Max work item sizes                             256x256x256
  Max work group size                             256
  Preferred work group size multiple              64
  Preferred / native vector sizes                 
    char                                                16 / 16      
    short                                                8 / 8       
    int                                                  4 / 4       
    long                                                 2 / 2       
    half                                                 0 / 0        (n/a)
    float                                                4 / 4       
    double                                               2 / 2        (cl_khr_fp64)
  Half-precision Floating-point support           (n/a)
  Single-precision Floating-point support         (core)
    Denormals                                     No
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 No
    Round to infinity                             No
    IEEE754-2008 fused multiply-add               No
    Support is emulated in software               No
    Correctly-rounded divide and sqrt operations  No
  Double-precision Floating-point support         (cl_khr_fp64)
    Denormals                                     Yes
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               No
    Correctly-rounded divide and sqrt operations  No
  Address bits                                    32, Little-Endian
  Global memory size                              1073741824 (1024MiB)
  Error Correction support                        No
  Max memory allocation                           268435456 (256MiB)
  Unified memory for Host and Device              Yes
  Minimum alignment for any data type             128 bytes
  Alignment of base address                       1024 bits (128 bytes)
  Global Memory cache type                        None
  Image support                                   No
  Local memory type                               Local
  Local memory size                               32768 (32KiB)
  Max constant buffer size                        268435456 (256MiB)
  Max number of constant args                     16
  Max size of kernel argument                     1024
  Queue properties                                
    Out-of-order execution                        No
    Profiling                                     Yes
  Profiling timer resolution                      0ns
  Execution capabilities                          
    Run OpenCL kernels                            Yes
    Run native kernels                            No
  Device Available                                Yes
  Compiler Available                              Yes
  Device Extensions                               cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_khr_fp64
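
The listing above has the shape of `clinfo` output (an assumption; the comment does not name the tool that produced it). A small sketch for pulling out just the identifying fields when comparing platforms such as pocl, beignet and Mesa; `summarize_devices` is a hypothetical helper name.

```shell
#!/bin/sh
# Keep only the identifying fields from a clinfo-style listing on stdin.
# summarize_devices is a hypothetical helper name.
summarize_devices() {
    grep -E '^ *(Device Name|Device Vendor|Device Version|Device Type) '
}

# Run against the local machine only if clinfo is actually installed.
if command -v clinfo >/dev/null 2>&1; then
    clinfo | summarize_devices || echo "clinfo reported no devices"
else
    echo "clinfo is not installed" >&2
fi
```

The filter also matches the "Device Vendor ID" line, since that label begins with "Device Vendor".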

znmeb avatar Nov 02 '16 22:11 znmeb