OpenCL Build error with Intel HD Graphics 620

Open JeanYvesJonet opened this issue 5 years ago • 9 comments

Hello,

I have a problem when I try to use OpenCL with Intel HD Graphics 620 (no problem with CUDA). The OpenCL version is 2.1. The Geeks3D OpenCL test reports no errors, so OpenCL itself is running fine. But when the program builds with OpenCL, the console window shows:

clCreateContext OK!
clCreateCommandQueue OK!
clCreateProgramWithSource OK!
Error: Failed to build program executable : -11
Error: Failed to create compute kernel : cai_dot_product
....

The program then continues to use OpenCL compute, with lots of errors ...

Do you know what the problem is in neural.cl, and why the program continues to use OpenCL after this error?

Thank you for your help,
JY

JeanYvesJonet avatar Nov 17 '20 14:11 JeanYvesJonet
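
For readers decoding the log: in the OpenCL headers, status -11 is `CL_BUILD_PROGRAM_FAILURE`, the code `clBuildProgram` returns when the kernel source fails to compile for the device. The sketch below is a small lookup table for a handful of status codes from `cl.h`. It assumes the -11 in the log is the raw return value of `clBuildProgram`; the helper name `describe_cl_status` is made up for illustration and is not part of neural-api.

```python
# Subset of OpenCL status codes as defined in the Khronos cl.h header.
CL_STATUS_NAMES = {
    0: "CL_SUCCESS",
    -1: "CL_DEVICE_NOT_FOUND",
    -2: "CL_DEVICE_NOT_AVAILABLE",
    -3: "CL_COMPILER_NOT_AVAILABLE",
    -5: "CL_OUT_OF_RESOURCES",
    -6: "CL_OUT_OF_HOST_MEMORY",
    -11: "CL_BUILD_PROGRAM_FAILURE",
    -30: "CL_INVALID_VALUE",
    -44: "CL_INVALID_PROGRAM",
    -45: "CL_INVALID_PROGRAM_EXECUTABLE",
    -46: "CL_INVALID_KERNEL_NAME",
}


def describe_cl_status(code: int) -> str:
    """Return the symbolic name for an OpenCL status code (hypothetical helper)."""
    return CL_STATUS_NAMES.get(code, f"unknown OpenCL status {code}")


if __name__ == "__main__":
    # Decode the value reported in the console output.
    print(describe_cl_status(-11))
```

The second error in the log most likely follows from the first, since a kernel cannot be created from a program that did not build. The compiler's own message can usually be retrieved with `clGetProgramBuildInfo` and `CL_PROGRAM_BUILD_LOG`.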

Hi - can you confirm please if you are using Linux or Windows? FPC or Delphi?

Are you trying any example supplied with the API?

joaopauloschuler avatar Nov 18 '20 06:11 joaopauloschuler

Hi, I use Windows 10 Pro 64-bit and Delphi 10.3. I have not tried an example supplied with the API. I have only tried the Geeks3D OpenCL test, and that works. Thanks

JeanYvesJonet avatar Nov 18 '20 07:11 JeanYvesJonet

Are you able to compile and run this example with Lazarus and CIFAR-10 dataset please? https://sourceforge.net/p/cai/svncode/HEAD/tree/trunk/lazarus/experiments/visualCifar10OpenCL/

You'll need the binary from this dataset: https://www.cs.toronto.edu/~kriz/cifar.html

And you'll need Lazarus: https://www.lazarus-ide.org/

joaopauloschuler avatar Nov 22 '20 10:11 joaopauloschuler

Hello, I tried it, but I get the same error. I updated the Intel HD driver to the latest version (October 2020), which now reports OpenCL 3.0, but the problem is the same. It is also difficult to stop the error reporting: so many errors are written to the console that I have to press Break repeatedly to stop it.

Here are the system information and the errors:

Platform info: 0
---------------------
PROFILE: FULL_PROFILE
VERSION: OpenCL 3.0
NAME: Intel(R) OpenCL HD Graphics
VENDOR: Intel(R) Corporation
EXTENSIONS: Intel(R) Corporation
Device info: 0
------------
DEVICE NAME: Intel(R) HD Graphics 620
DEVICE VENDOR: Intel(R) Corporation
DEVICE VERSION: OpenCL 3.0 NEO
DEVICE PROFILE: FULL_PROFILE
DEVICE EXTENSIONS: FULL_PROFILE
DEVICE TYPE: 4
DEVICE MAX WORK GROUP SIZE: 256
DEVICE MAX COMPUTE UNITS: 23
DEVICE IMAGE3D MAX WIDTH: 16384
DEVICE IMAGE3D MAX HEIGHT: 16384
DEVICE GLOBAL MEM SIZE: 1717985280
DEVICE LOCAL MEM SIZE: 65536
DEVICE COMPILER AVAILABLE: 1
DEVICE MAX CONSTANT BUFFER SIZE: 858992640
DEVICE MAX CONSTANT ARGS: 8

Training Images:40000 Test Images:10000 Validation Images:10000
Creating Neural Network...
Neural network has Softmax.
Setting L2 to:0.0000 Learning rate:0.0010 Staircase ephocs:50
clCreateContext OK!
clCreateCommandQueue OK!
clCreateProgramWithSource OK!
Error: Failed to build program executable:-11
Error: Failed to create compute kernel:cai_dot_product
Has OpenCL:TRUE Should OpenCL:TRUE Current layer:1
Layer:1 Vector:75 Neuron count:64 Output size:65536
Has OpenCL:TRUE Should OpenCL:TRUE Current layer:2

Thank you very much for your help.

JeanYvesJonet avatar Nov 24 '20 15:11 JeanYvesJonet
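
A note on reading the device info above: `DEVICE TYPE` is a bit field, and in `cl.h` the value 4 corresponds to `CL_DEVICE_TYPE_GPU`. `DEVICE COMPILER AVAILABLE: 1` indicates that the driver reports an online compiler, so the build failure is probably not a missing-compiler problem. The sketch below decodes the type bit field and converts the reported memory sizes to MiB; the helper names are illustrative only.

```python
from typing import List

# Bit flags for cl_device_type as defined in the Khronos cl.h header.
CL_DEVICE_TYPE_FLAGS = {
    1 << 0: "CL_DEVICE_TYPE_DEFAULT",
    1 << 1: "CL_DEVICE_TYPE_CPU",
    1 << 2: "CL_DEVICE_TYPE_GPU",
    1 << 3: "CL_DEVICE_TYPE_ACCELERATOR",
    1 << 4: "CL_DEVICE_TYPE_CUSTOM",
}


def decode_device_type(value: int) -> List[str]:
    """Return the names of all device-type bits set in value."""
    return [name for bit, name in CL_DEVICE_TYPE_FLAGS.items() if value & bit]


def to_mib(size_in_bytes: int) -> float:
    """Convert a byte count to mebibytes."""
    return size_in_bytes / (1024 * 1024)


if __name__ == "__main__":
    # Values copied from the device info in the post above.
    print("DEVICE TYPE 4 ->", decode_device_type(4))
    print("Global memory (MiB):", to_mib(1717985280))
    print("Local memory (MiB):", to_mib(65536))
```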

I'll prepare a test case with more logging for you to run in your environment. That way, at least, we'll get closer to a diagnosis.

joaopauloschuler avatar Feb 07 '21 11:02 joaopauloschuler

Hi, can you please retest it with the latest version? I suspect that it might be related to another recently reported bug: https://github.com/joaopauloschuler/neural-api/issues/41

joaopauloschuler avatar Feb 07 '21 17:02 joaopauloschuler

Hello, I tried it with the Lazarus sample, and it seems to work now. Good work 👍. I also tried it with Delphi 10.3: the project builds without error, but the same problem remains. Thanks.

Delphi console:
clCreateContext OK!
clCreateCommandQueue OK!
clCreateProgramWithSource OK!
Error: Failed to build program executable:-11
Error: Failed to create compute kernel:cai_dot_product

Lazarus console:
Platform info: 0
---------------------
PROFILE: FULL_PROFILE
VERSION: OpenCL 3.0
NAME: Intel(R) OpenCL HD Graphics
VENDOR: Intel(R) Corporation
EXTENSIONS: Intel(R) Corporation
Device info: 0
------------
DEVICE NAME: Intel(R) HD Graphics 620
DEVICE VENDOR: Intel(R) Corporation
DEVICE VERSION: OpenCL 3.0 NEO
DEVICE PROFILE: FULL_PROFILE
DEVICE EXTENSIONS: FULL_PROFILE
DEVICE TYPE: 4
DEVICE MAX WORK GROUP SIZE: 256
DEVICE MAX COMPUTE UNITS: 23
DEVICE IMAGE3D MAX WIDTH: 16384
DEVICE IMAGE3D MAX HEIGHT: 16384
DEVICE GLOBAL MEM SIZE: 1717985280
DEVICE LOCAL MEM SIZE: 65536
DEVICE COMPILER AVAILABLE: 1
DEVICE MAX CONSTANT BUFFER SIZE: 858992640
DEVICE MAX CONSTANT ARGS: 8
Number of threads:8 Algorithm:0 Color Encoding:0 Input Channels:3 Step Size:128
File name is: autosave-0.001-0.01-0.9-0-64-5-4-3-1-2-64-2-32-F-8-128-RGB-Softmax
Loading 10K images from file "data_batch_1.bin" ... GLOBAL MIN MAX -2.0000 1.9844 -2.0000 1.9844 -2.0000 1.9844 Done.
Loading 10K images from file "data_batch_2.bin" ... GLOBAL MIN MAX -2.0000 1.9844 -2.0000 1.9844 -2.0000 1.9844 Done.
Loading 10K images from file "data_batch_3.bin" ... GLOBAL MIN MAX -2.0000 1.9844 -2.0000 1.9844 -2.0000 1.9844 Done.
Loading 10K images from file "data_batch_4.bin" ... GLOBAL MIN MAX -2.0000 1.9844 -2.0000 1.9844 -2.0000 1.9844 Done.
Loading 10K images from file "data_batch_5.bin" ... GLOBAL MIN MAX -2.0000 1.9844 -2.0000 1.9844 -2.0000 1.9844 Done.
Loading 10K images from file "test_batch.bin" ... GLOBAL MIN MAX -2.0000 1.9844 -2.0000 1.9844 -2.0000 1.9844 Done.
Training Images:40000 Test Images:10000 Validation Images:10000
Creating Neural Network...
Neural network has Softmax.
Setting L2 to:0.0000 Learning rate:0.0010 Staircase ephocs:50
clCreateContext OK!
clCreateCommandQueue OK!
clCreateProgramWithSource OK!
clBuildProgram OK!
clCreateKernel cai_dot_product OK!
Has OpenCL:TRUE Should OpenCL:TRUE Current layer:1
Layer:1 Vector:75 Neuron count:64 Output size:65536
Has OpenCL:TRUE Should OpenCL:TRUE Current layer:2
Layer:2 Vector:576 Neuron count:64 Output size:65536
Has OpenCL:TRUE Should OpenCL:TRUE Current layer:4
Layer:4 Vector:576 Neuron count:64 Output size:4096
Has OpenCL:TRUE Should OpenCL:TRUE Current layer:5
Layer:5 Vector:576 Neuron count:64 Output size:4096
Has OpenCL:TRUE Should OpenCL:FALSE Current layer:7
Has OpenCL:TRUE Should OpenCL:FALSE Current layer:8
Has OpenCL:TRUE Should OpenCL:FALSE Current layer:9
clCreateContext OK!
clCreateCommandQueue OK!
clCreateProgramWithSource OK!
clBuildProgram OK!
clCreateKernel cai_dot_product OK!
Has OpenCL:TRUE Should OpenCL:TRUE Current layer:1
Layer:1 Vector:75 Neuron count:64 Output size:65536
Has OpenCL:TRUE Should OpenCL:TRUE Current layer:2
Layer:2 Vector:576 Neuron count:64 Output size:65536
Has OpenCL:TRUE Should OpenCL:TRUE Current layer:4
Layer:4 Vector:576 Neuron count:64 Output size:4096
Has OpenCL:TRUE Should OpenCL:TRUE Current layer:5
Layer:5 Vector:576 Neuron count:64 Output size:4096
Has OpenCL:TRUE Should OpenCL:FALSE Current layer:7
Has OpenCL:TRUE Should OpenCL:FALSE Current layer:8
Has OpenCL:TRUE Should OpenCL:FALSE Current layer:9
clCreateContext OK!
clCreateCommandQueue OK!
clCreateProgramWithSource OK!
clBuildProgram OK!
clCreateKernel cai_dot_product OK!
Has OpenCL:TRUE Should OpenCL:TRUE Current layer:1
Layer:1 Vector:75 Neuron count:64 Output size:65536
Has OpenCL:TRUE Should OpenCL:TRUE Current layer:2
Layer:2 Vector:576 Neuron count:64 Output size:65536
Has OpenCL:TRUE Should OpenCL:TRUE Current layer:4
Layer:4 Vector:576 Neuron count:64 Output size:4096
Has OpenCL:TRUE Should OpenCL:TRUE Current layer:5
Layer:5 Vector:576 Neuron count:64 Output size:4096
Has OpenCL:TRUE Should OpenCL:FALSE Current layer:7
Has OpenCL:TRUE Should OpenCL:FALSE Current layer:8
Has OpenCL:TRUE Should OpenCL:FALSE Current layer:9
clCreateContext OK!
clCreateCommandQueue OK!
clCreateProgramWithSource OK!
clBuildProgram OK!
clCreateKernel cai_dot_product OK!
Has OpenCL:TRUE Should OpenCL:TRUE Current layer:1
Layer:1 Vector:75 Neuron count:64 Output size:65536
Has OpenCL:TRUE Should OpenCL:TRUE Current layer:2
Layer:2 Vector:576 Neuron count:64 Output size:65536
Has OpenCL:TRUE Should OpenCL:TRUE Current layer:4
Layer:4 Vector:576 Neuron count:64 Output size:4096
Has OpenCL:TRUE Should OpenCL:TRUE Current layer:5
Layer:5 Vector:576 Neuron count:64 Output size:4096
Has OpenCL:TRUE Should OpenCL:FALSE Current layer:7
Has OpenCL:TRUE Should OpenCL:FALSE Current layer:8
Has OpenCL:TRUE Should OpenCL:FALSE Current layer:9
clCreateContext OK!
clCreateCommandQueue OK!
clCreateProgramWithSource OK!
clBuildProgram OK!
clCreateKernel cai_dot_product OK!
Has OpenCL:TRUE Should OpenCL:TRUE Current layer:1
Layer:1 Vector:75 Neuron count:64 Output size:65536
Has OpenCL:TRUE Should OpenCL:TRUE Current layer:2
Layer:2 Vector:576 Neuron count:64 Output size:65536
Has OpenCL:TRUE Should OpenCL:TRUE Current layer:4
Layer:4 Vector:576 Neuron count:64 Output size:4096
Has OpenCL:TRUE Should OpenCL:TRUE Current layer:5
Layer:5 Vector:576 Neuron count:64 Output size:4096
Has OpenCL:TRUE Should OpenCL:FALSE Current layer:7
Has OpenCL:TRUE Should OpenCL:FALSE Current layer:8
Has OpenCL:TRUE Should OpenCL:FALSE Current layer:9
clCreateContext OK!
clCreateCommandQueue OK!
clCreateProgramWithSource OK!
clBuildProgram OK!
clCreateKernel cai_dot_product OK!
Has OpenCL:TRUE Should OpenCL:TRUE Current layer:1
Layer:1 Vector:75 Neuron count:64 Output size:65536
Has OpenCL:TRUE Should OpenCL:TRUE Current layer:2
Layer:2 Vector:576 Neuron count:64 Output size:65536
Has OpenCL:TRUE Should OpenCL:TRUE Current layer:4
Layer:4 Vector:576 Neuron count:64 Output size:4096
Has OpenCL:TRUE Should OpenCL:TRUE Current layer:5
Layer:5 Vector:576 Neuron count:64 Output size:4096
Has OpenCL:TRUE Should OpenCL:FALSE Current layer:7
Has OpenCL:TRUE Should OpenCL:FALSE Current layer:8
Has OpenCL:TRUE Should OpenCL:FALSE Current layer:9
clCreateContext OK!
clCreateCommandQueue OK!
clCreateProgramWithSource OK!
clBuildProgram OK!
clCreateKernel cai_dot_product OK!
Has OpenCL:TRUE Should OpenCL:TRUE Current layer:1
Layer:1 Vector:75 Neuron count:64 Output size:65536
Has OpenCL:TRUE Should OpenCL:TRUE Current layer:2
Layer:2 Vector:576 Neuron count:64 Output size:65536
Has OpenCL:TRUE Should OpenCL:TRUE Current layer:4
Layer:4 Vector:576 Neuron count:64 Output size:4096
Has OpenCL:TRUE Should OpenCL:TRUE Current layer:5
Layer:5 Vector:576 Neuron count:64 Output size:4096
Has OpenCL:TRUE Should OpenCL:FALSE Current layer:7
Has OpenCL:TRUE Should OpenCL:FALSE Current layer:8
Has OpenCL:TRUE Should OpenCL:FALSE Current layer:9
clCreateContext OK!
clCreateCommandQueue OK!
clCreateProgramWithSource OK!
clBuildProgram OK!
clCreateKernel cai_dot_product OK!
Has OpenCL:TRUE Should OpenCL:TRUE Current layer:1
Layer:1 Vector:75 Neuron count:64 Output size:65536
Has OpenCL:TRUE Should OpenCL:TRUE Current layer:2
Layer:2 Vector:576 Neuron count:64 Output size:65536
Has OpenCL:TRUE Should OpenCL:TRUE Current layer:4
Layer:4 Vector:576 Neuron count:64 Output size:4096
Has OpenCL:TRUE Should OpenCL:TRUE Current layer:5
Layer:5 Vector:576 Neuron count:64 Output size:4096
Has OpenCL:TRUE Should OpenCL:FALSE Current layer:7
Has OpenCL:TRUE Should OpenCL:FALSE Current layer:8
Has OpenCL:TRUE Should OpenCL:FALSE Current layer:9
Layer 0 Max Output: 0.000 Min Output: 0.000 TNNetInput 32,32,3 Times: 0.00s 0.00s
Layer 1 Neurons: 64 Max Weight: 0.200 Min Weight: -0.200 Max Output: 0.000 Min Output: 0.000 TNNetConvolutionReLU 32,32,64 Times: 0.00s 0.00s Parent:0
Layer 2 Neurons: 64 Max Weight: 0.072 Min Weight: -0.072 Max Output: 0.000 Min Output: 0.000 TNNetConvolutionLinear 32,32,64 Times: 0.00s 0.00s Parent:1
Layer 3 Max Output: 0.000 Min Output: 0.000 TNNetMaxPool 8,8,64 Times: 0.00s 0.00s Parent:2
Layer 4 Neurons: 64 Max Weight: 0.072 Min Weight: -0.072 Max Output: 0.000 Min Output: 0.000 TNNetConvolutionReLU 8,8,64 Times: 0.00s 0.00s Parent:3
Layer 5 Neurons: 64 Max Weight: 0.072 Min Weight: -0.072 Max Output: 0.000 Min Output: 0.000 TNNetConvolutionLinear 8,8,64 Times: 0.00s 0.00s Parent:4
Layer 6 Max Output: 0.000 Min Output: 0.000 TNNetMaxPool 2,2,64 Times: 0.00s 0.00s Parent:5
Layer 7 Neurons: 32 Max Weight: 0.144 Min Weight: -0.144 Max Output: 0.000 Min Output: 0.000 TNNetFullConnectReLU 32,1,1 Times: 0.00s 0.00s Parent:6
Layer 8 Neurons: 32 Max Weight: 0.305 Min Weight: -0.306 Max Output: 0.000 Min Output: 0.000 TNNetFullConnectReLU 32,1,1 Times: 0.00s 0.00s Parent:7
Layer 9 Neurons: 10 Max Weight: 0.378 Min Weight: -0.376 Max Output: 0.000 Min Output: 0.000 TNNetFullConnectLinear 10,1,1 Times: 0.00s 0.00s Parent:8
Layer 10 Max Output: 0.000 Min Output: 0.000 TNNetSoftMax 10,1,1 Times: 0.00s 0.00s Parent:9
Neural network has: Layers: 11 Neurons:330 Weights:124928 Sum: -27.002260
Layer 0 Neurons: 0 Weights: 0 TNNetInput(32,32,3,0,0) Output:32,32,3 Learning Rate:0.0010 Inertia:0.90 Weight Sum: 0.0000 Branches:1
Layer 1 Neurons: 64 Weights: 4800 TNNetConvolutionReLU(64,5,2,1,0) Output:32,32,64 Learning Rate:0.0010 Inertia:0.90 Weight Sum: 9.3083 Parent:0 Branches:1
Layer 2 Neurons: 64 Weights: 36864 TNNetConvolutionLinear(64,3,1,1,0) Output:32,32,64 Learning Rate:0.0010 Inertia:0.90 Weight Sum:-16.7340 Parent:1 Branches:1
Layer 3 Neurons: 0 Weights: 0 TNNetMaxPool(4,4,0,0,0) Output:8,8,64 Learning Rate:0.0010 Inertia:0.90 Weight Sum: 0.0000 Parent:2 Branches:1
Layer 4 Neurons: 64 Weights: 36864 TNNetConvolutionReLU(64,3,1,0,0) Output:8,8,64 Learning Rate:0.0010 Inertia:0.90 Weight Sum: -2.0621 Parent:3 Branches:1
Layer 5 Neurons: 64 Weights: 36864 TNNetConvolutionLinear(64,3,1,0,0) Output:8,8,64 Learning Rate:0.0010 Inertia:0.90 Weight Sum: -3.9453 Parent:4 Branches:1
Layer 6 Neurons: 0 Weights: 0 TNNetMaxPool(4,4,0,0,0) Output:2,2,64 Learning Rate:0.0010 Inertia:0.90 Weight Sum: 0.0000 Parent:5 Branches:1
Layer 7 Neurons: 32 Weights: 8192 TNNetFullConnectReLU(32,1,1,0,0) Output:32,1,1 Learning Rate:0.0010 Inertia:0.90 Weight Sum:-12.0966 Parent:6 Branches:1
Layer 8 Neurons: 32 Weights: 1024 TNNetFullConnectReLU(32,1,1,0,0) Output:32,1,1 Learning Rate:0.0010 Inertia:0.90 Weight Sum: 0.9780 Parent:7 Branches:1
Layer 9 Neurons: 10 Weights: 320 TNNetFullConnectLinear(10,1,1,0,0) Output:10,1,1 Learning Rate:0.0010 Inertia:0.90 Weight Sum: -2.4505 Parent:8 Branches:1
Layer 10 Neurons: 0 Weights: 0 TNNetSoftMax(0,0,0,0,0) Output:10,1,1 Learning Rate:0.0010 Inertia:0.90 Weight Sum: 0.0000 Parent:9 Branches:0
Computing...
1280 Examples seen. Accuracy:0.1328 Error: 1.78795 Loss:2.26123 Threads: 8 Thread Forward: 59.10s Thread Backward: 5.81s Time: 44.16s
2560 Examples seen. Accuracy:0.1328 Error: 1.77895 Loss:2.24366 Threads: 8 Thread Forward: 59.45s Thread Backward: 4.66s Time: 42.69s
3840 Examples seen. Accuracy:0.2109 Error: 1.72816 Loss:2.09909 Threads: 8 Thread Forward: 52.07s Thread Backward: 4.02s Time: 40.45s

JeanYvesJonet avatar Feb 09 '21 07:02 JeanYvesJonet
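
On the earlier question of why computation carries on after the build error: the failing logs in this thread still print `Has OpenCL:TRUE` after `Failed to build program executable`. One common way to avoid that is to treat initialization as a chain in which any failed step disables the OpenCL path. The following is a generic sketch of that pattern, not the library's actual code; `init_opencl` and the simulated step callables are hypothetical stand-ins for the real API calls.

```python
from typing import Callable, List, Tuple

# Each step is (label, callable returning an OpenCL-style status; 0 means success).
Step = Tuple[str, Callable[[], int]]


def init_opencl(steps: List[Step]) -> Tuple[bool, List[str]]:
    """Run initialization steps in order and stop at the first failure.

    Returns (has_opencl, log_lines). has_opencl is True only when every step
    returned 0, so the caller can fall back to the CPU path otherwise.
    """
    log: List[str] = []
    for label, step in steps:
        status = step()
        if status != 0:
            log.append(f"Error: {label} failed: {status}")
            return False, log  # later steps are skipped on purpose
        log.append(f"{label} OK!")
    return True, log


if __name__ == "__main__":
    # Simulated run in which the build step fails with -11.
    simulated = [
        ("clCreateContext", lambda: 0),
        ("clCreateCommandQueue", lambda: 0),
        ("clCreateProgramWithSource", lambda: 0),
        ("clBuildProgram", lambda: -11),
        ("clCreateKernel cai_dot_product", lambda: 0),
    ]
    has_opencl, lines = init_opencl(simulated)
    print("\n".join(lines))
    print("Has OpenCL:", has_opencl)
```

With a guard like this, layers would only be dispatched to OpenCL when the returned flag is true, which would also stop the flood of follow-on errors in the console.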

I'm wondering if file neural/neural.cl is currently being cached when running via Delphi... Eventually, I'll have a look.

joaopauloschuler avatar Feb 09 '21 11:02 joaopauloschuler

As reported by PierceNg at the Lazarus forum:

@schuler,

I am happy to report that I have OpenCL Dot Product example running also on my Macbook Pro 2012 model, which has Intel i7 CPU, integrated HD Graphics 4000, and GeForce GT 650M.

Running directly by double-clicking on the app in Finder appears to work - total OpenCL runtime reports 0.02s - but not really. When I run the executable from the shell, the program reports:

File neural.cl could not be found.
Error: Failed to create compute kernel:cai_dot_product

I copied neural.cl into the app bundle directory where the executable opencl-dot-product-test is, then ran the executable from the shell again:

clCreateContext OK!
clCreateCommandQueue OK!
clCreateProgramWithSource OK!
clBuildProgram OK!
clCreateKernel cai_dot_product OK!
clCreateKernel cai_dot_product OK!

Screenshot of successful run attached.

But double clicking on the app from Finder still didn't work even with neural.cl already copied in, as in OpenCL runtime is again 0.02s.

Oh, I also had to fix a FPC_FULLVERSION test in the file Lazarus/components/multithreadprocs/mtpcpu.pas in order to build the program.

joaopauloschuler avatar Mar 11 '21 08:03 joaopauloschuler
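
The report above is consistent with `neural.cl` being looked up relative to the current working directory, which can differ between a shell launch and a Finder launch. That is an assumption, not something confirmed in this thread. Under that assumption, a lookup that also tries the program's own directory could be sketched as follows; `find_kernel_source` and `default_search_dirs` are hypothetical helpers, not part of neural-api.

```python
import os
import sys
from typing import Iterable, Optional


def find_kernel_source(file_name: str, search_dirs: Iterable[str]) -> Optional[str]:
    """Return the first existing path to file_name among search_dirs, or None."""
    for directory in search_dirs:
        candidate = os.path.join(directory, file_name)
        if os.path.isfile(candidate):
            return os.path.abspath(candidate)
    return None


def default_search_dirs() -> list:
    """Candidate directories: the working directory, then the program's own directory."""
    program_dir = os.path.dirname(os.path.abspath(sys.argv[0]))
    return [os.getcwd(), program_dir]


if __name__ == "__main__":
    path = find_kernel_source("neural.cl", default_search_dirs())
    if path is None:
        print("File neural.cl could not be found.")
    else:
        print("Using kernel source:", path)
```

Checking the program's directory in addition to the working directory means the result no longer depends on how the program was launched.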