crfasrnn icon indicating copy to clipboard operation
crfasrnn copied to clipboard

Process killed

Open AlexTS1980 opened this issue 7 years ago • 0 comments

I'm running the test script to segment objects in an 800x800 image. I use Tesla K40 graphics card as a GPU processor. Why do I keep getting message that the process is Killed? Sometimes it works though, but it takes a long time (10-20 seconds) to process it. Here's the output for CUDA deviceQuery, I also install CuDNN 3.0.8 (the only one that worked). Any suggestions?

Device 0: "Tesla K40m" CUDA Driver Version / Runtime Version 7.5 / 7.5 CUDA Capability Major/Minor version number: 3.5 Total amount of global memory: 11520 MBytes (12079136768 bytes) (15) Multiprocessors, (192) CUDA Cores/MP: 2880 CUDA Cores GPU Max Clock rate: 745 MHz (0.75 GHz) Memory Clock rate: 3004 Mhz Memory Bus Width: 384-bit L2 Cache Size: 1572864 bytes Maximum Texture Dimension Size (x,y,z) 1D=(65536), 2D=(65536, 65536), 3D=(4096, 4096, 4096) Maximum Layered 1D Texture Size, (num) layers 1D=(16384), 2048 layers Maximum Layered 2D Texture Size, (num) layers 2D=(16384, 16384), 2048 layers Total amount of constant memory: 65536 bytes Total amount of shared memory per block: 49152 bytes Total number of registers available per block: 65536 Warp size: 32 Maximum number of threads per multiprocessor: 2048 Maximum number of threads per block: 1024 Max dimension size of a thread block (x,y,z): (1024, 1024, 64) Max dimension size of a grid size (x,y,z): (2147483647, 65535, 65535) Maximum memory pitch: 2147483647 bytes Texture alignment: 512 bytes Concurrent copy and kernel execution: Yes with 2 copy engine(s) Run time limit on kernels: No Integrated GPU sharing Host Memory: No Support host page-locked memory mapping: Yes Alignment requirement for Surfaces: Yes Device has ECC support: Enabled Device supports Unified Addressing (UVA): Yes Device PCI Domain ID / Bus ID / location ID: 0 / 4 / 0 Compute Mode: < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

Device 1: "NVS 315" CUDA Driver Version / Runtime Version 7.5 / 7.5 CUDA Capability Major/Minor version number: 2.1 Total amount of global memory: 1023 MBytes (1072889856 bytes) ( 1) Multiprocessors, ( 48) CUDA Cores/MP: 48 CUDA Cores GPU Max Clock rate: 1046 MHz (1.05 GHz) Memory Clock rate: 875 Mhz Memory Bus Width: 64-bit L2 Cache Size: 65536 bytes Maximum Texture Dimension Size (x,y,z) 1D=(65536), 2D=(65536, 65535), 3D=(2048, 2048, 2048) Maximum Layered 1D Texture Size, (num) layers 1D=(16384), 2048 layers Maximum Layered 2D Texture Size, (num) layers 2D=(16384, 16384), 2048 layers Total amount of constant memory: 65536 bytes Total amount of shared memory per block: 49152 bytes Total number of registers available per block: 32768 Warp size: 32 Maximum number of threads per multiprocessor: 1536 Maximum number of threads per block: 1024 Max dimension size of a thread block (x,y,z): (1024, 1024, 64) Max dimension size of a grid size (x,y,z): (65535, 65535, 65535) Maximum memory pitch: 2147483647 bytes Texture alignment: 512 bytes Concurrent copy and kernel execution: Yes with 1 copy engine(s) Run time limit on kernels: No Integrated GPU sharing Host Memory: No Support host page-locked memory mapping: Yes Alignment requirement for Surfaces: Yes Device has ECC support: Disabled Device supports Unified Addressing (UVA): Yes Device PCI Domain ID / Bus ID / location ID: 0 / 3 / 0 Compute Mode: < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

Peer access from Tesla K40m (GPU0) -> NVS 315 (GPU1) : No Peer access from NVS 315 (GPU1) -> Tesla K40m (GPU0) : No deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 7.5, CUDA Runtime Version = 7.5, NumDevs = 2, Device0 = Tesla K40m, Device1 = NVS 315

AlexTS1980 avatar Nov 18 '16 13:11 AlexTS1980