
Nvidia RTX 3080 GPU not detected

Open Raunak-Singh-Inventor opened this issue 2 years ago • 16 comments

Hi,

Aparapi is not detecting my gpu even though I have OpenCL installed. When I run these two lines:

Device device = Device.firstGPU();
Range range = device.createRange(size);

It throws a NullPointerException, which suggests that the GPU is not detected.

GPU: Nvidia RTX 3080 OpenCL version: 3.0 Operating System: Pop OS (Ubuntu)
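
The NullPointerException only shows that the device lookup returned null. A small guard makes that failure explicit before createRange is called. The sketch below uses only the standard library so it runs without Aparapi; DeviceGuard and requireDetected are hypothetical names, not part of Aparapi, and the lambda stands in for Device::firstGPU.

```java
import java.util.function.Supplier;

final class DeviceGuard {

    /**
     * Runs the lookup and fails with a readable message when it yields null.
     * Hypothetical helper for illustration; not an Aparapi API.
     */
    static <T> T requireDetected(Supplier<T> lookup, String what) {
        T found = lookup.get();
        if (found == null) {
            throw new IllegalStateException(what + " not detected: the lookup returned null");
        }
        return found;
    }

    public static void main(String[] args) {
        // With Aparapi on the classpath this would read (assumption):
        //   Device device = requireDetected(Device::firstGPU, "GPU");
        String device = requireDetected(() -> "stand-in device", "GPU");
        System.out.println(device);

        try {
            requireDetected(() -> null, "GPU");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

This does not fix detection; it only replaces the bare NullPointerException with a message that names the missing device.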

CudaSquare.java:

/*
 * Copyright (C) 2021 - present by EY LLP. and EYQP team. 
 * Any change of "copy" of code without author permission is not allowed.
 * This is NOT a open source project, DONOT copy any Code.
 * For QuantLib please see LICENSCE.TXT provided. 
*/
package cudaProgramming.parallel;

import com.aparapi.Kernel;
import com.aparapi.Range;
import com.aparapi.device.Device;

/**
 * An example Aparapi application which computes and displays squares of a set
 * of 512 input values. While executing on GPU using Aparapi framework, each
 * square value is computed in a separate kernel invocation and can thus
 * maximize performance by optimally utilizing all GPU computing units
 *
 * @author gfrost
 * @version $Id: $Id
 */
public class CudaSquare {

	/**
	 * <p>
	 * main.
	 * </p>
	 *
	 * @param _args an array of {@link java.lang.String} objects.
	 */
	public static void main(String[] _args) {
		// loop must be multiple of 8 or 8 bit
		final int size = 160;

		/** Input float array for which square values need to be computed. */
		final float[] values = new float[size];

		/** Initialize input array. */
		for (int i = 0; i < size; i++) {
			values[i] = i;
		}

		/**
		 * Output array which will be populated with square values of corresponding
		 * input array elements.
		 */
		final float[] squares = new float[size];

		/**
		 * Aparapi Kernel which computes squares of input array elements and populates
		 * them in corresponding elements of output array.
		 **/
		Kernel kernel = new Kernel() {
			@Override
			public void run() {
				int gid = getGlobalId();
				squares[gid] = values[gid] * values[gid];
			}
		};

		Device device = Device.firstGPU();
		Range range = device.createRange(size);

		// Execute Kernel.

		kernel.execute(range);

		// Report target execution mode: GPU or JTP (Java Thread Pool).
		System.out.println("Device = " + kernel.getTargetDevice().getShortDescription());

		// Display computed square values.
		for (int i = 0; i < size; i++) {
			System.out.printf("%6.0f %8.0f\n", values[i], squares[i]);
		}

		// Dispose Kernel resources.
		kernel.dispose();
	}

}

Output of /usr/bin/clinfo:

Number of platforms                               1
  Platform Name                                   NVIDIA CUDA
  Platform Vendor                                 NVIDIA Corporation
  Platform Version                                OpenCL 3.0 CUDA 11.7.89
  Platform Profile                                FULL_PROFILE
  Platform Extensions                             cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64 cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_nv_copy_opts cl_nv_create_buffer cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_device_uuid cl_khr_pci_bus_info cl_khr_external_semaphore cl_khr_external_memory cl_khr_external_semaphore_opaque_fd cl_khr_external_memory_opaque_fd
  Platform Extensions with Version                cl_khr_global_int32_base_atomics                                 0x400000 (1.0.0)
                                                  cl_khr_global_int32_extended_atomics                             0x400000 (1.0.0)
                                                  cl_khr_local_int32_base_atomics                                  0x400000 (1.0.0)
                                                  cl_khr_local_int32_extended_atomics                              0x400000 (1.0.0)
                                                  cl_khr_fp64                                                      0x400000 (1.0.0)
                                                  cl_khr_3d_image_writes                                           0x400000 (1.0.0)
                                                  cl_khr_byte_addressable_store                                    0x400000 (1.0.0)
                                                  cl_khr_icd                                                       0x400000 (1.0.0)
                                                  cl_khr_gl_sharing                                                0x400000 (1.0.0)
                                                  cl_nv_compiler_options                                           0x400000 (1.0.0)
                                                  cl_nv_device_attribute_query                                     0x400000 (1.0.0)
                                                  cl_nv_pragma_unroll                                              0x400000 (1.0.0)
                                                  cl_nv_copy_opts                                                  0x400000 (1.0.0)
                                                  cl_nv_create_buffer                                              0x400000 (1.0.0)
                                                  cl_khr_int64_base_atomics                                        0x400000 (1.0.0)
                                                  cl_khr_int64_extended_atomics                                    0x400000 (1.0.0)
                                                  cl_khr_device_uuid                                               0x400000 (1.0.0)
                                                  cl_khr_pci_bus_info                                              0x400000 (1.0.0)
                                                  cl_khr_external_semaphore                                          0x9000 (0.9.0)
                                                  cl_khr_external_memory                                             0x9000 (0.9.0)
                                                  cl_khr_external_semaphore_opaque_fd                                0x9000 (0.9.0)
                                                  cl_khr_external_memory_opaque_fd                                   0x9000 (0.9.0)
  Platform Numeric Version                        0xc00000 (3.0.0)
  Platform Extensions function suffix             NV
  Platform Host timer resolution                  0ns

  Platform Name                                   NVIDIA CUDA
Number of devices                                 1
  Device Name                                     NVIDIA GeForce RTX 3080 Laptop GPU
  Device Vendor                                   NVIDIA Corporation
  Device Vendor ID                                0x10de
  Device Version                                  OpenCL 3.0 CUDA
  Device UUID                                     808d8c70-1aef-49f3-cc51-44aaea760720
  Driver UUID                                     808d8c70-1aef-49f3-cc51-44aaea760720
  Valid Device LUID                               No
  Device LUID                                     6d69-637300000000
  Device Node Mask                                0
  Device Numeric Version                          0xc00000 (3.0.0)
  Driver Version                                  515.48.07
  Device OpenCL C Version                         OpenCL C 1.2 
  Device OpenCL C all versions                    OpenCL C                                                         0x400000 (1.0.0)
                                                  OpenCL C                                                         0x401000 (1.1.0)
                                                  OpenCL C                                                         0x402000 (1.2.0)
                                                  OpenCL C                                                         0xc00000 (3.0.0)
  Device OpenCL C features                        __opencl_c_fp64                                                  0xc00000 (3.0.0)
                                                  __opencl_c_images                                                0xc00000 (3.0.0)
                                                  __opencl_c_int64                                                 0xc00000 (3.0.0)
                                                  __opencl_c_3d_image_writes                                       0xc00000 (3.0.0)
  Latest comfornace test passed                   v2021-02-01-00
  Device Type                                     GPU
  Device Topology (NV)                            PCI-E, 0000:01:00.0
  Device Profile                                  FULL_PROFILE
  Device Available                                Yes
  Compiler Available                              Yes
  Linker Available                                Yes
  Max compute units                               48
  Max clock frequency                             1710MHz
  Compute Capability (NV)                         8.6
  Device Partition                                (core)
    Max number of sub-devices                     1
    Supported partition types                     None
    Supported affinity domains                    (n/a)
  Max work item dimensions                        3
  Max work item sizes                             1024x1024x64
  Max work group size                             1024
  Preferred work group size multiple (device)     32
  Preferred work group size multiple (kernel)     32
  Warp size (NV)                                  32
  Max sub-groups per work group                   0
  Preferred / native vector sizes                 
    char                                                 1 / 1       
    short                                                1 / 1       
    int                                                  1 / 1       
    long                                                 1 / 1       
    half                                                 0 / 0        (n/a)
    float                                                1 / 1       
    double                                               1 / 1        (cl_khr_fp64)
  Half-precision Floating-point support           (n/a)
  Single-precision Floating-point support         (core)
    Denormals                                     Yes
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               No
    Correctly-rounded divide and sqrt operations  Yes
  Double-precision Floating-point support         (cl_khr_fp64)
    Denormals                                     Yes
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               No
  Address bits                                    64, Little-Endian
  Global memory size                              16908353536 (15.75GiB)
  Error Correction support                        No
  Max memory allocation                           4227088384 (3.937GiB)
  Unified memory for Host and Device              No
  Integrated memory (NV)                          No
  Shared Virtual Memory (SVM) capabilities        (core)
    Coarse-grained buffer sharing                 Yes
    Fine-grained buffer sharing                   No
    Fine-grained system sharing                   No
    Atomics                                       No
  Minimum alignment for any data type             128 bytes
  Alignment of base address                       4096 bits (512 bytes)
  Preferred alignment for atomics                 
    SVM                                           0 bytes
    Global                                        0 bytes
    Local                                         0 bytes
  Atomic memory capabilities                      relaxed, work-group scope
  Atomic fence capabilities                       relaxed, acquire/release, work-group scope
  Max size for global variable                    0
  Preferred total size of global vars             0
  Global Memory cache type                        Read/Write
  Global Memory cache size                        1376256 (1.312MiB)
  Global Memory cache line size                   128 bytes
  Image support                                   Yes
    Max number of samplers per kernel             32
    Max size for 1D images from buffer            268435456 pixels
    Max 1D or 2D image array size                 2048 images
    Max 2D image size                             32768x32768 pixels
    Max 3D image size                             16384x16384x16384 pixels
    Max number of read image args                 256
    Max number of write image args                32
    Max number of read/write image args           0
  Pipe support                                    No
  Max number of pipe args                         0
  Max active pipe reservations                    0
  Max pipe packet size                            0
  Local memory type                               Local
  Local memory size                               49152 (48KiB)
  Registers per block (NV)                        65536
  Max number of constant args                     9
  Max constant buffer size                        65536 (64KiB)
  Generic address space support                   No
  Max size of kernel argument                     4352 (4.25KiB)
  Queue properties (on host)                      
    Out-of-order execution                        Yes
    Profiling                                     Yes
  Device enqueue capabilities                     (n/a)
  Queue properties (on device)                    
    Out-of-order execution                        No
    Profiling                                     No
    Preferred size                                0
    Max size                                      0
  Max queues on device                            0
  Max events on device                            0
  Prefer user sync for interop                    No
  Profiling timer resolution                      1000ns
  Execution capabilities                          
    Run OpenCL kernels                            Yes
    Run native kernels                            No
    Non-uniform work-groups                       No
    Work-group collective functions               No
    Sub-group independent forward progress        No
    Kernel execution timeout (NV)                 Yes
  Concurrent copy and kernel execution (NV)       Yes
    Number of async copy engines                  2
    IL version                                    (n/a)
    ILs with version                              <printDeviceInfo:186: get CL_DEVICE_ILS_WITH_VERSION : error -30>
  printf() buffer size                            1048576 (1024KiB)
  Built-in kernels                                (n/a)
  Built-in kernels with version                   <printDeviceInfo:190: get CL_DEVICE_BUILT_IN_KERNELS_WITH_VERSION : error -30>
  Device Extensions                               cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64 cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_nv_copy_opts cl_nv_create_buffer cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_device_uuid cl_khr_pci_bus_info cl_khr_external_semaphore cl_khr_external_memory cl_khr_external_semaphore_opaque_fd cl_khr_external_memory_opaque_fd
  Device Extensions with Version                  cl_khr_global_int32_base_atomics                                 0x400000 (1.0.0)
                                                  cl_khr_global_int32_extended_atomics                             0x400000 (1.0.0)
                                                  cl_khr_local_int32_base_atomics                                  0x400000 (1.0.0)
                                                  cl_khr_local_int32_extended_atomics                              0x400000 (1.0.0)
                                                  cl_khr_fp64                                                      0x400000 (1.0.0)
                                                  cl_khr_3d_image_writes                                           0x400000 (1.0.0)
                                                  cl_khr_byte_addressable_store                                    0x400000 (1.0.0)
                                                  cl_khr_icd                                                       0x400000 (1.0.0)
                                                  cl_khr_gl_sharing                                                0x400000 (1.0.0)
                                                  cl_nv_compiler_options                                           0x400000 (1.0.0)
                                                  cl_nv_device_attribute_query                                     0x400000 (1.0.0)
                                                  cl_nv_pragma_unroll                                              0x400000 (1.0.0)
                                                  cl_nv_copy_opts                                                  0x400000 (1.0.0)
                                                  cl_nv_create_buffer                                              0x400000 (1.0.0)
                                                  cl_khr_int64_base_atomics                                        0x400000 (1.0.0)
                                                  cl_khr_int64_extended_atomics                                    0x400000 (1.0.0)
                                                  cl_khr_device_uuid                                               0x400000 (1.0.0)
                                                  cl_khr_pci_bus_info                                              0x400000 (1.0.0)
                                                  cl_khr_external_semaphore                                          0x9000 (0.9.0)
                                                  cl_khr_external_memory                                             0x9000 (0.9.0)
                                                  cl_khr_external_semaphore_opaque_fd                                0x9000 (0.9.0)
                                                  cl_khr_external_memory_opaque_fd                                   0x9000 (0.9.0)

NULL platform behavior
  clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...)  No platform
  clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...)   No platform
  clCreateContext(NULL, ...) [default]            No platform
  clCreateContext(NULL, ...) [other]              Success [NV]
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_DEFAULT)  No platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU)  No platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM)  Invalid device type for platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL)  No platform

Raunak-Singh-Inventor avatar Jun 26 '22 23:06 Raunak-Singh-Inventor

Raunak....

I looked at the Aparapi code here

https://github.com/Syncleus/aparapi-native/blob/bdfdb0b45c8a5105907500ec5cee701d92e35a19/src/cpp/invoke/OpenCLJNI.cpp#L472

Even though you have OpenCL 3.0 you should still see the message/warning

"Aparapi is running on an untested OpenCL platform version"

Have you built an OpenCL native program on this machine and had it actually execute?

Do you have experience compiling C++ code on Linux from cmake?

grfrost avatar Jun 28 '22 13:06 grfrost

@grfrost I have experience compiling with cmake on Linux.

Can you please give me the instructions to install a supported version of OpenCL?

Raunak-Singh-Inventor avatar Jun 28 '22 13:06 Raunak-Singh-Inventor

You probably cannot get an earlier version.

But aparapi should work on yours.

You could pull the 3 maven repos and build yourself. That way we can add diagnostics in the native code to see what is going wrong.

Back to my earlier question: have you built (and run!) your own OpenCL C/C++ code on this machine? I want to make sure all the shared libs are in place.

grfrost avatar Jun 28 '22 15:06 grfrost

@grfrost No, I have not run my own OpenCL code. Do you have instructions for how I can do that?

Also, what are the 3 maven repos you are asking me to build?

Thanks for your help.

Raunak-Singh-Inventor avatar Jun 28 '22 17:06 Raunak-Singh-Inventor

Ok I just created a public github repo.

Can you clone this and build (cmake) and see if the build and runjava.sh script work for you?

https://github.com/grfrost/javacltest

grfrost avatar Jun 30 '22 11:06 grfrost

@grfrost Thanks for the repository link. I ran it and got this output from bash runjava.sh:

There is 1 platform
platform 0{
   CL_PLATFORM_VENDOR.."NVIDIA Corporation"
   CL_PLATFORM_VERSION."OpenCL 3.0 CUDA 11.7.89"
   CL_PLATFORM_NAME...."NVIDIA CUDA"
   Platform 0 has 1 device{
      Device 0{
         CL_DEVICE_TYPE..................... GPU (0x0) 
         CL_DEVICE_MAX_COMPUTE_UNITS........ 48
         CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS. 3
             dim[0] = 1024
             dim[1] = 1024
             dim[2] = 64
         CL_DEVICE_MAX_WORK_GROUP_SIZE...... 1024
         CL_DEVICE_MAX_MEM_ALLOC_SIZE....... 4227088384
         CL_DEVICE_GLOBAL_MEM_SIZE.......... 16908353536
         CL_DEVICE_LOCAL_MEM_SIZE........... 49152
         CL_DEVICE_PROFILE.................. FULL_PROFILE
         CL_DEVICE_VERSION.................. OpenCL 3.0 CUDA
         CL_DRIVER_VERSION.................. 515.48.07
         CL_DEVICE_OPENCL_C_VERSION......... OpenCL C 1.2 
         CL_DEVICE_NAME..................... NVIDIA GeForce RTX 3080 Laptop GPU
         CL_DEVICE_EXTENSIONS............... cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64 cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_nv_copy_opts cl_nv_create_buffer cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_device_uuid cl_khr_pci_bus_info cl_khr_external_semaphore cl_khr_external_memory cl_khr_external_semaphore_opaque_fd cl_khr_external_memory_opaque_fd
         CL_DEVICE_BUILT_IN_KERNELS......... 
      }
   }
}

Raunak-Singh-Inventor avatar Jun 30 '22 21:06 Raunak-Singh-Inventor

OK so java can call jni and opencl on this machine, Aparapi should run just fine.

I suspect that the issue is that for some reason on this platform the OpenCL shared lib is not on the default path/LD_LIBRARY_PATH

We need to find libOpenCL and either add it to LD_LIBRARY_PATH or tell java where this library is when we run Aparapi.

So first let's find your libOpenCL.so

We will use linux ldd command to query the dynamic libraries loaded by your clinfo tool and note the directory where libOpenCL.so is

$ ldd /usr/bin/clinfo
    linux-vdso.so.1 (0x00007ffd439b4000)
    libOpenCL.so.1 => /opt/rocm/lib/libOpenCL.so.1 (0x00007f504df13000)
    libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f504def2000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f504dd00000)
    libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f504dcdd000)
    /lib64/ld-linux-x86-64.so.2 (0x00007f504e139000)

So my libOpenCL.so (mine is from AMD) is in /opt/rocm/lib

Once you have located libOpenCL you need to add that dir to your java command line using -Djava.library.path=/opt/rocm/lib (or whatever your path is)

$ java -Djava.library.path=/opt/rocm/lib ... other Aparapi options

Does that make sense?
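
One way to confirm the setting took effect is to have the JVM report which directories on java.library.path actually contain an OpenCL library. This is a minimal sketch using only the standard library; LibraryPathCheck and findLibrary are hypothetical names, and matching on the file-name prefix libOpenCL.so is an assumption about how the library is named on Linux.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

final class LibraryPathCheck {

    /** Lists files whose name starts with namePrefix in each directory of a path list. */
    static List<String> findLibrary(String pathList, String namePrefix) {
        List<String> hits = new ArrayList<>();
        if (pathList == null || pathList.isEmpty()) {
            return hits;
        }
        for (String dir : pathList.split(File.pathSeparator)) {
            if (dir.isEmpty()) {
                continue;
            }
            // list() returns null when the entry is missing or not a directory.
            String[] names = new File(dir).list();
            if (names == null) {
                continue;
            }
            for (String name : names) {
                if (name.startsWith(namePrefix)) {
                    hits.add(new File(dir, name).getPath());
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String path = System.getProperty("java.library.path");
        System.out.println("java.library.path = " + path);
        List<String> hits = findLibrary(path, "libOpenCL.so");
        if (hits.isEmpty()) {
            System.out.println("no libOpenCL.so* found on java.library.path");
        }
        for (String hit : hits) {
            System.out.println("found " + hit);
        }
    }
}
```

The same helper accepts System.getenv("LD_LIBRARY_PATH") as the path list, which covers the environment-variable alternative as well.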

grfrost avatar Jul 01 '22 09:07 grfrost

here is the output

(base) rauna@pop-os:~$ ldd /usr/bin/clinfo
    linux-vdso.so.1 (0x00007ffd9fbce000)
    libOpenCL.so.1 => /usr/local/cuda-11.7/lib64/libOpenCL.so.1 (0x00007fe000600000)
    libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007fe0008f0000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fe0003d8000)
    libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fe0008eb000)
    /lib64/ld-linux-x86-64.so.2 (0x00007fe000939000)
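
The directory to pass on can be read off the libOpenCL line of that output. As a rough sketch, the extraction could look like the following; LddOpenCl and openClDirectory are hypothetical names, and the pattern assumes the usual "name => path (address)" layout that ldd prints.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LddOpenCl {

    // Matches e.g. "libOpenCL.so.1 => /some/dir/libOpenCL.so.1" and captures the path.
    private static final Pattern OPENCL_LINE =
            Pattern.compile("libOpenCL\\.so\\S*\\s+=>\\s+(\\S+)");

    /** Returns the directory ldd resolved libOpenCL to, or null if it was not resolved. */
    static String openClDirectory(String lddOutput) {
        Matcher m = OPENCL_LINE.matcher(lddOutput);
        if (!m.find()) {
            return null;
        }
        String path = m.group(1);
        int slash = path.lastIndexOf('/');
        // "not found" has no slash, so an unresolved library also yields null.
        return slash > 0 ? path.substring(0, slash) : null;
    }

    public static void main(String[] args) {
        String line = "libOpenCL.so.1 => /usr/local/cuda-11.7/lib64/libOpenCL.so.1 (0x00007fe000600000)";
        String dir = openClDirectory(line);
        System.out.println("-Djava.library.path=" + dir);
        System.out.println("LD_LIBRARY_PATH=" + dir);
    }
}
```

For the output above this gives /usr/local/cuda-11.7/lib64 as the directory.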

Raunak-Singh-Inventor avatar Jul 04 '22 10:07 Raunak-Singh-Inventor

@grfrost after passing the command line argument I am still getting the same error:

/home/rauna/.jdks/openjdk-18.0.1.1/bin/java -javaagent:/home/rauna/.local/share/JetBrains/Toolbox/apps/IDEA-C/ch-0/221.5921.22/lib/idea_rt.jar=42963:/home/rauna/.local/share/JetBrains/Toolbox/apps/IDEA-C/ch-0/221.5921.22/bin -Dfile.encoding=UTF-8 -classpath /home/rauna/Documents/cudaProgramming/target/classes:/home/rauna/.m2/repository/com/aparapi/aparapi/2.0.0/aparapi-2.0.0.jar:/home/rauna/.m2/repository/com/aparapi/aparapi-jni/1.4.2/aparapi-jni-1.4.2.jar:/home/rauna/.m2/repository/org/apache/bcel/bcel/6.4.1/bcel-6.4.1.jar:/home/rauna/.m2/repository/org/scala-lang/scala-library/2.13.1/scala-library-2.13.1.jar cudaProgramming.parallel.CudaSquare -Djava.library.path=/usr/local/cuda-11.7/lib64/
Exception in thread "main" java.lang.NullPointerException: Cannot invoke "com.aparapi.device.Device.createRange(int)" because "device" is null
	at cudaProgramming.parallel.CudaSquare.main(CudaSquare.java:62)

Process finished with exit code 1

Raunak-Singh-Inventor avatar Jul 04 '22 11:07 Raunak-Singh-Inventor

I replied to your email with pictures. ;) in case you don't get that.

I think you added -Djava.library.path=...... AFTER your class name

-Dxxxxx options are intended for the JVM not params to your code

So they appear before the class you are using

You have /home/rauna/.jdks/openjdk-18.0.1.1/bin/java ...../scala-library-2.13.1.jar cudaProgramming.parallel.CudaSquare -Djava.library.path=/usr/local/cuda-11.7/lib64/

You want /home/rauna/.jdks/openjdk-18.0.1.1/bin/java ...../scala-library-2.13.1.jar -Djava.library.path=/usr/local/cuda-11.7/lib64/ cudaProgramming.parallel.CudaSquare
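
The difference between the two command lines can be checked from inside the program: anything placed after the class name arrives in the args array of main and never becomes a system property. Here is a small sketch of such a check; JvmOptionCheck and misplacedJvmOptions are hypothetical names, and treating every argument of the form -Dname=value as a misplaced JVM option is an assumption that suits this program, which takes no arguments of its own.

```java
import java.util.ArrayList;
import java.util.List;

final class JvmOptionCheck {

    /** Returns program arguments that look like -Dname=value system property options. */
    static List<String> misplacedJvmOptions(String[] args) {
        List<String> misplaced = new ArrayList<>();
        for (String arg : args) {
            if (arg.startsWith("-D") && arg.contains("=")) {
                misplaced.add(arg);
            }
        }
        return misplaced;
    }

    public static void main(String[] args) {
        for (String opt : misplacedJvmOptions(args)) {
            System.err.println("Warning: " + opt
                    + " was passed as a program argument; put it before the class name");
        }
        // Shows what the JVM is really using, whatever was typed on the command line.
        System.out.println("java.library.path = " + System.getProperty("java.library.path"));
    }
}
```

Calling misplacedJvmOptions(_args) at the top of CudaSquare.main would flag the first command line and stay silent for the second.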

In IntelliJ, don't add -Dxxxx as a program argument; add it as a JVM option.

So in run config

image.png

Click on modify options and then vmoptions

image.png

Then in the new 'VM Options' text field

image.png

Add -Djava.library.path=/usr/local/cuda-11.7/lib64/

image.png

That should do it ;). hopefully

grfrost avatar Jul 04 '22 13:07 grfrost

@grfrost I have now added the argument to the JVM options as you said, but I am still getting the same error. Please assist:

/home/rauna/.jdks/openjdk-18.0.1.1/bin/java -Djava.library.path=/usr/local/cuda-11.7/lib64/ -javaagent:/home/rauna/.local/share/JetBrains/Toolbox/apps/IDEA-C/ch-0/221.5921.22/lib/idea_rt.jar=43091:/home/rauna/.local/share/JetBrains/Toolbox/apps/IDEA-C/ch-0/221.5921.22/bin -Dfile.encoding=UTF-8 -classpath /home/rauna/Documents/cudaProgramming/target/classes:/home/rauna/.m2/repository/com/aparapi/aparapi/2.0.0/aparapi-2.0.0.jar:/home/rauna/.m2/repository/com/aparapi/aparapi-jni/1.4.2/aparapi-jni-1.4.2.jar:/home/rauna/.m2/repository/org/apache/bcel/bcel/6.4.1/bcel-6.4.1.jar:/home/rauna/.m2/repository/org/scala-lang/scala-library/2.13.1/scala-library-2.13.1.jar cudaProgramming.parallel.CudaSquare
Exception in thread "main" java.lang.NullPointerException: Cannot invoke "com.aparapi.device.Device.createRange(int)" because "device" is null
	at cudaProgramming.parallel.CudaSquare.main(CudaSquare.java:62)

Note: I can't see the images on my end. I just see text of image.png

Raunak-Singh-Inventor avatar Jul 04 '22 14:07 Raunak-Singh-Inventor

OK I don't run using intellij.

Two things to try.

  1. Try getting rid of the intellij stuff from the above command line and launch from the command line

/home/rauna/.jdks/openjdk-18.0.1.1/bin/java -Djava.library.path=/usr/local/cuda-11.7/lib64/ -classpath /home/rauna/Documents/cudaProgramming/target/classes:/home/rauna/.m2/repository/com/aparapi/aparapi/2.0.0/aparapi-2.0.0.jar:/home/rauna/.m2/repository/com/aparapi/aparapi-jni/1.4.2/aparapi-jni-1.4.2.jar:/home/rauna/.m2/repository/org/apache/bcel/bcel/6.4.1/bcel-6.4.1.jar:/home/rauna/.m2/repository/org/scala-lang/scala-library/2.13.1/scala-library-2.13.1.jar cudaProgramming.parallel.CudaSquare

  2. Set LD_LIBRARY_PATH (intellij probably has a way to set an environment variable at launch).

LD_LIBRARY_PATH=/usr/local/cuda-11.7/lib64/ /home/rauna/.jdks/openjdk-18.0.1.1/bin/java -Djava.library.path=/usr/local/cuda-11.7/lib64/ -classpath /home/rauna/Documents/cudaProgramming/target/classes:/home/rauna/.m2/repository/com/aparapi/aparapi/2.0.0/aparapi-2.0.0.jar:/home/rauna/.m2/repository/com/aparapi/aparapi-jni/1.4.2/aparapi-jni-1.4.2.jar:/home/rauna/.m2/repository/org/apache/bcel/bcel/6.4.1/bcel-6.4.1.jar:/home/rauna/.m2/repository/org/scala-lang/scala-library/2.13.1/scala-library-2.13.1.jar cudaProgramming.parallel.CudaSquare

I left the Scala stuff, but worry a bit about why it is there?

Gary



grfrost avatar Jul 04 '22 14:07 grfrost

Suggestion 1) Please give me clear, straightforward instructions, as I don't have much experience compiling with Maven from the command line. I rely heavily on IntelliJ.

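Suggestion 2) didn't work even though I set the env path here: [image: image] https://user-images.githubusercontent.com/46200816/177446684-23491aca-8035-4aee-acb5-e5356b6c2e83.png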

Raunak-Singh-Inventor avatar Jul 06 '22 01:07 Raunak-Singh-Inventor

I have done a lot of work to try to help you.

Please read the message I put on the aparapi gitlab page.

@mandeepsingh-private you might be able to rebuild and test using

https://github.com/grfrost/aparapi- https://github.com/grfrost/aparapi-m1 builder-cmake

Build and run work for me on Apple M1 and Linux x64 platforms, so hopefully it will find your NVIDIA CUDA OpenCL lib.

I don't generally use intellij

I have never used nvidia cards

You are going to need to be able to use the command line to debug this...

So now is your chance to learn


grfrost avatar Jul 06 '22 07:07 grfrost

@grfrost I tried using CUDA C++ and it is working for me. Thanks for all of your help, I will look at this again at a later date when I'm a little more experienced :)

Raunak-Singh-Inventor avatar Jul 15 '22 20:07 Raunak-Singh-Inventor

Excellent, CUDA has some great tool support. It is a great way to learn the fundamentals.

Gary


grfrost avatar Jul 16 '22 07:07 grfrost

My bad ;)

-Dxxxxx options are intended for the JVM; they are not params to your code.

So they appear before the class you are using

You have /home/rauna/.jdks/openjdk-18.0.1.1/bin/java ...../scala-library-2.13.1.jar cudaProgramming.parallel.CudaSquare -Djava.library.path=/usr/local/cuda-11.7/lib64/

You want /home/rauna/.jdks/openjdk-18.0.1.1/bin/java ...../scala-library-2.13.1.jar -Djava.library.path=/usr/local/cuda-11.7/lib64/ cudaProgramming.parallel.CudaSquare

In IntelliJ, don't add -Dxxxx as a program argument; add it as a JVM option. So in the run config:

[image: image.png]

Click on "Modify options" and then "VM options".

[image: image.png]

Then, in the new 'VM Options' text field:

[image: image.png]

Add -Djava.library.path=/usr/local/cuda-11.7/lib64/

[image: image.png]

That should do it ;). hopefully
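The difference is easy to check from inside the program: anything placed after the main class arrives in `args`, while a real JVM option shows up in `System.getProperty`. Below is a minimal sketch of such a check. The class name `LaunchArgsCheck` and its helper are hypothetical, written only to illustrate the point, and the `-D...=...` pattern match is a simple heuristic.

```java
import java.util.ArrayList;
import java.util.List;

public class LaunchArgsCheck {

    /** Returns the program arguments that look like -Dname=value JVM options. */
    public static List<String> misplacedSystemProperties(String[] args) {
        List<String> misplaced = new ArrayList<>();
        for (String arg : args) {
            // Heuristic: a JVM system property has the form -Dname=value.
            if (arg.startsWith("-D") && arg.contains("=")) {
                misplaced.add(arg);
            }
        }
        return misplaced;
    }

    public static void main(String[] args) {
        for (String arg : misplacedSystemProperties(args)) {
            System.err.println("Looks like a JVM option passed as a program argument: " + arg);
        }
        // Shows what the JVM itself was given, regardless of args.
        System.out.println("java.library.path = " + System.getProperty("java.library.path"));
    }
}
```

Launched as `java LaunchArgsCheck -Djava.library.path=/usr/local/cuda-11.7/lib64/`, the option is reported as misplaced and the printed `java.library.path` does not include the CUDA directory; with the option before the class name, nothing is flagged.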



grfrost avatar Oct 11 '22 09:10 grfrost

I found myself with the same problem on Windows with an RTX 3060.

The website says that "It is no longer required to manually install the Aparapi JNI native interface, this is now done automatically through maven as a dependency on Aparapi."

I thought that meant this was the only needed dependency:

<dependency>
    <groupId>com.aparapi</groupId>
    <artifactId>aparapi</artifactId>
    <version>2.0.0</version>
</dependency>

However I also needed the dependency on aparapi-jni:

<dependency>
    <groupId>com.aparapi</groupId>
    <artifactId>aparapi-jni</artifactId>
    <version>1.4.3</version>
</dependency>

which fixed everything. Posting this just in case someone has a similar issue.
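To check which Aparapi jars actually ended up on the runtime classpath, something like the following can be run from the same project. This is only a sketch: the class name `AparapiClasspathCheck` is made up for illustration, and it assumes the jar keeps its Maven file name (for example `aparapi-jni-1.4.3.jar`).

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

public class AparapiClasspathCheck {

    /** Returns classpath entries whose file name starts with the given artifact prefix. */
    public static List<String> entriesFor(String classPath, String artifactPrefix) {
        List<String> matches = new ArrayList<>();
        if (classPath == null) {
            return matches;
        }
        for (String entry : classPath.split(File.pathSeparator)) {
            // Compare only the file name, not the directories leading to it.
            if (new File(entry).getName().startsWith(artifactPrefix)) {
                matches.add(entry);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        String classPath = System.getProperty("java.class.path");
        // Assumes the jar is named after its Maven artifact id, e.g. aparapi-jni-1.4.3.jar.
        System.out.println("aparapi-jni jars on classpath: " + entriesFor(classPath, "aparapi-jni-"));
    }
}
```

An empty list suggests the native-interface artifact is not being resolved for this run; more than one entry suggests two versions are present at once.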

ebeaufay avatar Nov 30 '23 11:11 ebeaufay