FastMaskRCNN

GPU OOM during init_op

Open astromme opened this issue 7 years ago • 2 comments

astromme@Snowbank:~/Code/FastMaskRCNN$ python train/train.py
P2
P3
P4
P5
anchor_scales =  [8, 16, 32]
anchor_scales =  [4, 8, 16]
anchor_scales =  [2, 4, 8]
anchor_scales =  [1, 2, 4]
5
4
3
2
/home/astromme/miniconda2/lib/python2.7/site-packages/tensorflow/python/ops/gradients_impl.py:93: UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
  "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "
2017-07-12 14:27:28.710538: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2017-07-12 14:27:28.710557: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-07-12 14:27:28.710573: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-07-12 14:27:28.710578: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2017-07-12 14:27:28.710582: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
2017-07-12 14:27:28.827137: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:901] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2017-07-12 14:27:28.827477: I tensorflow/core/common_runtime/gpu/gpu_device.cc:887] Found device 0 with properties: 
name: GeForce GTX 1070
major: 6 minor: 1 memoryClockRate (GHz) 1.7715
pciBusID 0000:01:00.0
Total memory: 7.92GiB
Free memory: 7.05GiB
2017-07-12 14:27:28.827489: I tensorflow/core/common_runtime/gpu/gpu_device.cc:908] DMA: 0 
2017-07-12 14:27:28.827492: I tensorflow/core/common_runtime/gpu/gpu_device.cc:918] 0:   Y 
2017-07-12 14:27:28.827511: I tensorflow/core/common_runtime/gpu/gpu_device.cc:977] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1070, pci bus id: 0000:01:00.0)
2017-07-12 14:27:28.828725: E tensorflow/stream_executor/cuda/cuda_driver.cc:893] failed to allocate 7.13G (7654381824 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY

Training appears to keep going after this error, but I suspect something isn't initialized properly.

astromme, Jul 12 '17 21:07

https://github.com/astromme/FastMaskRCNN/commit/24cf5dd043921f735dcd9ea5a620fcdee2661d47 seems to fix it.

astromme, Jul 12 '17 21:07

For the error "failed to allocate 7.13G (7654381824 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY", you can set gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.85) in the train() function and pass it to the session through tf.ConfigProto(gpu_options=gpu_options), so that TensorFlow reserves a smaller share of GPU memory for training.

landiaokafeiyan, Nov 08 '17 02:11
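
To make the suggestion above concrete, here is a minimal sketch of how the memory fraction relates to the numbers in the log (7.92GiB total, 7.05GiB free, 7.13G requested). It assumes the fraction is applied to the card's total memory, which may not hold exactly for every TensorFlow version. The helper names (reserved_bytes, fits, make_session_config) are hypothetical and are not part of FastMaskRCNN or TensorFlow.

```python
GIB = 1024 ** 3  # bytes per GiB


def reserved_bytes(total_gib, fraction):
    """Bytes TensorFlow would try to reserve up front.

    Assumption: the fraction is applied to total GPU memory.
    """
    return int(total_gib * GIB * fraction)


def fits(total_gib, free_gib, fraction):
    """True if the up-front reservation is no larger than the free memory."""
    return reserved_bytes(total_gib, fraction) <= free_gib * GIB


def make_session_config(fraction=0.85):
    """Build a session config that caps the per-process GPU memory share.

    TensorFlow is imported lazily so the arithmetic helpers above work
    without it. On TensorFlow 1.x the classes live at the top level; on
    newer releases they are reached through tf.compat.v1.
    """
    import tensorflow as tf

    tf1 = getattr(getattr(tf, "compat", tf), "v1", tf)
    gpu_options = tf1.GPUOptions(per_process_gpu_memory_fraction=fraction)
    return tf1.ConfigProto(gpu_options=gpu_options)


if __name__ == "__main__":
    # 0.85 of 7.92GiB is about 6.73GiB, below the 7.05GiB reported free.
    print(fits(7.92, 7.05, 0.85))
    # The failed request from the log is larger than the free memory.
    print(7654381824 <= 7.05 * GIB)
```

With the figures from the log, the first check passes and the second does not, which is consistent with the allocation failing at the default setting. In train(), the config would then be handed to the session, for example tf.Session(config=make_session_config(0.85)); where exactly train.py creates its session is an assumption here, so the call site may need adjusting.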