kNNcuda

illegal memory access

Open itabhiyanta opened this issue 7 years ago • 17 comments

Hi

I am trying to use the Example binary that is created when I compile your code (which uses main.cpp). I have changed the location of the .ply file to my point cloud with 2M points, but I get the following error.

kNNcuda/build$ ./Example
Reading file /home/rohit/Documents/bruteforceNN/kNNcuda/src/xyz.ply ...
Finished reading file.
2000000
Constructing kd-tree...
Finished kd-tree construction.
Start Normal Calculation...
an illegal memory access was encountered in /home/rohit/Documents/bruteforceNN/kNNcuda/src/calcNormalsCuda.cu at line 912

Even if I give it a smaller point cloud, say 30k points, it still gives me the same error at the same line.

itabhiyanta avatar May 28 '17 06:05 itabhiyanta

Hi itabhiyanta,

Thanks for testing the software. We tested it with different scans, each with up to 60000000 points, and the illegal memory access never happened. That's very interesting. Could you send me your point cloud so I can fix this issue? Additionally, could you give me information about your hardware (GPU) and driver installation?

Greetings Alex

amock avatar May 30 '17 09:05 amock

Hi Alex

Here you are. I have attached the main.cpp, which compiles to Example, and the sample point cloud I have. It has only 30000 points.

I have an Nvidia Quadro M4000 with about 8 GB of memory.

Thanks a lot for your prompt response. Rohit

main_simple.zip

itabhiyanta avatar May 30 '17 09:05 itabhiyanta

I tried to estimate normals with your point cloud and it works. No memory error, not even with cuda-memcheck. Can you try cuda-memcheck or cuda-gdb? Maybe they give us better messages about where the code fails on your GPU. And have you installed CUDA 8.0? Here is a snapshot of the resulting normals (snapshot00).

amock avatar May 30 '17 10:05 amock

I changed the compilation from a specific architecture to a general architecture. Maybe that solves your problem. Just pull the master again.

amock avatar May 30 '17 11:05 amock

Hi Alexander

I pulled the master and again ran ./Example from the build directory, after changing the path to the .ply file to the location where I keep my 30k point cloud. I still get the same error.

I checked the output of cmake and it seems to be OK. Something in my environment is not OK.

I also tried to connect to my desktop remotely (I thought the error might occur because the GPU is also being used for rendering the display) and then ran the example, but that also gives the same error.

Reading file /home/rohit/Documents/bruteforceNN/kNNcuda/src/simple.ply ...
Finished reading file.
30000
Constructing kd-tree...
Finished kd-tree construction.
Start Normal Calculation...
an illegal memory access was encountered in /home/rohit/Documents/bruteforceNN/kNNcuda/src/calcNormalsCuda.cu at line 912

I guess you also tried running just ./Example from the command line, right?

with kind regards Rohit

itabhiyanta avatar May 30 '17 11:05 itabhiyanta

When I run with cuda-memcheck I get this output:

rohit@rohit-desktop-work:~/Documents/bruteforceNN/kNNcuda/build$ cuda-memcheck ./Example
========= CUDA-MEMCHECK
Reading file /home/rohit/Documents/bruteforceNN/kNNcuda/src/simple.ply ...
Finished reading file.
30000
Constructing kd-tree...
Finished kd-tree construction.
Start Normal Calculation...
an illegal memory access was encountered in /home/rohit/Documents/bruteforceNN/kNNcuda/src/calcNormalsCuda.cu at line 912
========= Internal error (7)
========= No CUDA-MEMCHECK results found

itabhiyanta avatar May 30 '17 11:05 itabhiyanta

Okay, I was able to see something that I guess is useful using cuda-gdb:

rohit@rohit-desktop-work:~/Documents/bruteforceNN/kNNcuda/build$ cuda-gdb ./Example
NVIDIA (R) CUDA Debugger
8.0 release
Portions Copyright (C) 2007-2016 NVIDIA Corporation
GNU gdb (GDB) 7.6.2
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-unknown-linux-gnu".
For bug reporting instructions, please see: http://www.gnu.org/software/gdb/bugs/...
Reading symbols from /home/rohit/Documents/bruteforceNN/kNNcuda/build/Example...done.
(cuda-gdb) run ./Example
Starting program: /home/rohit/Documents/bruteforceNN/kNNcuda/build/./Example ./Example
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Reading file /home/rohit/Documents/bruteforceNN/kNNcuda/src/simple.ply ...
Finished reading file.
30000
Constructing kd-tree...
[New Thread 0x7ffff5f02700 (LWP 30644)]
Finished kd-tree construction.
Start Normal Calculation...
[New Thread 0x7ffff5701700 (LWP 30645)]
[New Thread 0x7ffff4f00700 (LWP 30646)]

CUDA Exception: Warp Illegal Address
The exception was triggered at PC 0xc0a4d8

Program received signal CUDA_EXCEPTION_14, Warp Illegal Address.
[Switching focus to CUDA kernel 0, grid 1, block (13,0,0), thread (224,0,0), device 0, sm 0, warp 36, lane 0]
0x0000000000c0a4f0 in KNNKernel(PointArray, PointArray, PointArray, int, int) <<<(30,1,1),(1024,1,1)>>> ()

itabhiyanta avatar May 30 '17 11:05 itabhiyanta

Can you show me the output of your deviceQuery? It's usually located in /usr/local/cuda/samples/1_Utilities/deviceQuery

amock avatar May 30 '17 12:05 amock

rohit@rohit-desktop-work:/usr/local/cuda-8.0/samples/bin/x86_64/linux/release$ ./deviceQuery
./deviceQuery Starting...

CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "Quadro M4000"
  CUDA Driver Version / Runtime Version: 8.0 / 8.0
  CUDA Capability Major/Minor version number: 5.2
  Total amount of global memory: 8119 MBytes (8513716224 bytes)
  (13) Multiprocessors, (128) CUDA Cores/MP: 1664 CUDA Cores
  GPU Max Clock rate: 773 MHz (0.77 GHz)
  Memory Clock rate: 3005 Mhz
  Memory Bus Width: 256-bit
  L2 Cache Size: 2097152 bytes
  Maximum Texture Dimension Size (x,y,z): 1D=(65536), 2D=(65536, 65536), 3D=(4096, 4096, 4096)
  Maximum Layered 1D Texture Size, (num) layers: 1D=(16384), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers: 2D=(16384, 16384), 2048 layers
  Total amount of constant memory: 65536 bytes
  Total amount of shared memory per block: 49152 bytes
  Total number of registers available per block: 65536
  Warp size: 32
  Maximum number of threads per multiprocessor: 2048
  Maximum number of threads per block: 1024
  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
  Max dimension size of a grid size (x,y,z): (2147483647, 65535, 65535)
  Maximum memory pitch: 2147483647 bytes
  Texture alignment: 512 bytes
  Concurrent copy and kernel execution: Yes with 2 copy engine(s)
  Run time limit on kernels: No
  Integrated GPU sharing Host Memory: No
  Support host page-locked memory mapping: Yes
  Alignment requirement for Surfaces: Yes
  Device has ECC support: Disabled
  Device supports Unified Addressing (UVA): Yes
  Device PCI Domain ID / Bus ID / location ID: 0 / 2 / 0
  Compute Mode: < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 8.0, CUDA Runtime Version = 8.0, NumDevs = 1, Device0 = Quadro M4000
Result = PASS

itabhiyanta avatar May 30 '17 12:05 itabhiyanta

I created a warp-fix branch. Please try it out. It's hard to figure out what is going wrong when my cuda-gdb output says everything is okay :D.

amock avatar May 30 '17 12:05 amock

I will check it now, Alexander. I appreciate your time and interest in this. Thanks again. Rohit

itabhiyanta avatar May 30 '17 12:05 itabhiyanta

The warp-fix branch worked, Alexander. Thanks! I'll try a bigger point cloud now.

itabhiyanta avatar May 30 '17 12:05 itabhiyanta

OK, so a 2M point cloud again seems to end up with the same error:

rohit@rohit-desktop-work:~/Documents/bruteforceNN/kNNcuda/build$ cuda-gdb ./Example
NVIDIA (R) CUDA Debugger
8.0 release
Portions Copyright (C) 2007-2016 NVIDIA Corporation
GNU gdb (GDB) 7.6.2
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-unknown-linux-gnu".
For bug reporting instructions, please see: http://www.gnu.org/software/gdb/bugs/...
Reading symbols from /home/rohit/Documents/bruteforceNN/kNNcuda/build/Example...done.
(cuda-gdb) run ./Example
Starting program: /home/rohit/Documents/bruteforceNN/kNNcuda/build/./Example ./Example
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Reading file /home/rohit/Documents/bruteforceNN/xyz.ply ...
Finished reading file.
2000000
Constructing kd-tree...
[New Thread 0x7ffff5f02700 (LWP 5681)]
Finished kd-tree construction.
Start Normal Calculation...
[New Thread 0x7ffff3d7b700 (LWP 5691)]
[New Thread 0x7ffff357a700 (LWP 5692)]

CUDA Exception: Warp Illegal Address
The exception was triggered at PC 0x394d538

Program received signal CUDA_EXCEPTION_14, Warp Illegal Address.
[Switching focus to CUDA kernel 0, grid 1, block (7679,0,0), thread (0,0,0), device 0, sm 0, warp 12, lane 0]
0x000000000394d570 in KNNKernel(PointArray, PointArray, PointArray, int, int)<<<(62500,1,1),(32,1,1)>>> ()
(cuda-gdb)
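The block and thread coordinates in the two cuda-gdb reports can be turned into a flat thread index. The sketch below assumes KNNKernel uses the common 1D mapping blockIdx.x * blockDim.x + threadIdx.x, which is an assumption, since the kernel source is not shown in this thread. The helper names globalIndex and blocksFor are hypothetical and only serve the illustration.

```cpp
// Flat thread index under the common 1D mapping
// blockIdx.x * blockDim.x + threadIdx.x (an assumption about KNNKernel).
constexpr long long globalIndex(long long blockIdxX, long long blockDimX,
                                long long threadIdxX)
{
    return blockIdxX * blockDimX + threadIdxX;
}

// Number of blocks needed to cover n items with the given block size
// (ceiling division), matching the <<<(30,1,1),(1024,1,1)>>> and
// <<<(62500,1,1),(32,1,1)>>> launch configurations in the reports.
constexpr long long blocksFor(long long n, long long threadsPerBlock)
{
    return (n + threadsPerBlock - 1) / threadsPerBlock;
}
```

With the values from the reports, globalIndex(13, 1024, 224) is 13536 for the 30000-point run and globalIndex(7679, 32, 0) is 245728 for the 2000000-point run. If the mapping assumption holds, both faulting threads sit well inside the point range, so the fault does not look like surplus threads at the end of the grid running past the last point.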

itabhiyanta avatar May 30 '17 12:05 itabhiyanta

Hi Rohit, I guess I can't figure out the bug in the code when I can't test the new fixes. If you can repair it please let me know! Also, if I have a smart idea and see a wrong memory access or warp in the code I will update the warp-fix branch. Greetings Alex

amock avatar May 31 '17 09:05 amock

Thanks Alexander

I will look into it. If I am able to find a fix, I'll inform you.

with kind regards Rohit

itabhiyanta avatar May 31 '17 13:05 itabhiyanta

Had the same error. Just removed the HANDLE_ERROR() while doing the memcpy in line 912 and it worked. Problem:

HANDLE_ERROR( cudaMemcpy(Result_Normals.elements, D_Result_Normals.elements, size, cudaMemcpyDeviceToHost ) );

After removing HANDLE_ERROR (which is not an ideal thing to do, I guess), it works:

 cudaMemcpy(Result_Normals.elements, D_Result_Normals.elements, size, cudaMemcpyDeviceToHost  );
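A note on why the error shows up at the memcpy: kernel launches are asynchronous, so a fault inside a kernel is typically reported by the next synchronizing call, here the cudaMemcpy at line 912. That fits the cuda-gdb output earlier in the thread, which places the illegal address inside KNNKernel. Dropping the check therefore silences the report rather than the fault, and the copied normals may not be valid. The following is a minimal sketch of checking right after the launch. CHECK_CUDA and scaleKernel are hypothetical names used for illustration only and are not taken from the kNNcuda sources, whose HANDLE_ERROR macro may differ.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical stand-in for the repository's HANDLE_ERROR macro.
#define CHECK_CUDA(call)                                                  \
    do {                                                                  \
        cudaError_t err_ = (call);                                        \
        if (err_ != cudaSuccess) {                                        \
            std::fprintf(stderr, "%s in %s at line %d\n",                 \
                         cudaGetErrorString(err_), __FILE__, __LINE__);   \
            std::exit(EXIT_FAILURE);                                      \
        }                                                                 \
    } while (0)

// Hypothetical kernel; the guard keeps surplus threads inside the buffer.
__global__ void scaleKernel(float* data, int n, float factor)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        data[i] *= factor;
    }
}

int main()
{
    const int n = 30000;
    const int threads = 1024;
    const int blocks = (n + threads - 1) / threads;

    float* d_data = nullptr;
    CHECK_CUDA(cudaMalloc(&d_data, n * sizeof(float)));
    CHECK_CUDA(cudaMemset(d_data, 0, n * sizeof(float)));

    scaleKernel<<<blocks, threads>>>(d_data, n, 2.0f);
    CHECK_CUDA(cudaGetLastError());       // errors detected at launch time
    CHECK_CUDA(cudaDeviceSynchronize());  // errors raised while the kernel ran

    float first = 0.0f;
    CHECK_CUDA(cudaMemcpy(&first, d_data, sizeof(float),
                          cudaMemcpyDeviceToHost));
    std::printf("first element: %f\n", first);

    CHECK_CUDA(cudaFree(d_data));
    return 0;
}
```

With the two checks placed directly after the launch, a faulting kernel is reported at the launch site instead of at a later, unrelated copy.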

ahmadhasan2k8 avatar Feb 14 '18 03:02 ahmadhasan2k8

Hello,

I think I figured it out. Because the algorithm uses CUDA dynamic parallelism, you have to add an extra set command to the CMakeLists.txt to indicate this:

set(CUDA_SEPARABLE_COMPILATION TRUE)

I hope this helps everyone.

ktsiolis91 avatar Jun 10 '19 13:06 ktsiolis91