Semantic-Segmentation-Suite

ResourceExhaustedError when using batch_size >= 64

Open vanqm opened this issue 6 years ago • 3 comments


Information

  • What are your command line arguments?:
    • Dataset: CamVid
    • Batch size: 64
    • Crop height/width: 224
  • Have you written any custom code?: No
  • What have you done to try and solve this issue?: Tried using a batch size < 64, and no ResourceExhaustedError occurs. But I want to use a batch size ≥ 64 for other datasets.
  • TensorFlow version?: TensorFlow-GPU 1.9.0

Describe the problem

  • ResourceExhaustedError occurs when using batch size = 64.
  • Watching with `watch -n 0.5 nvidia-smi` shows that GPU:0 memory is exhausted.
  • I have 2 GPUs, but why is only GPU:0 used?
  • What about GPU:1? (A sketch for pinning a specific GPU is included after this list for reference.)
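For context, a minimal TF 1.x sketch of pinning the process to one GPU and letting memory grow on demand. These options are not exposed by this repo's scripts, so they would have to be edited in by hand where the session is created; the settings themselves are standard TensorFlow API.

```python
# Illustrative sketch only (TF 1.x); not part of this repo's train script.
import os

# Make only the second physical GPU visible to TensorFlow
# (inside the process it will appear as /gpu:0).
# Must be set before TensorFlow initializes the devices.
os.environ['CUDA_VISIBLE_DEVICES'] = '1'

import tensorflow as tf

config = tf.ConfigProto()
config.gpu_options.allow_growth = True   # allocate GPU memory on demand
config.log_device_placement = False      # set True to see which device each op lands on
sess = tf.Session(config=config)
```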

Thanks, Van

vanqm avatar Dec 25 '18 07:12 vanqm

Hello Van, as of now this project doesn't support multi-GPU training. This means it will only use one GPU, even if you have more. I guess it's because this code is based on tf.slim, and it's intricate to do multi-GPU training with tf.slim. If you do want to use multiple GPUs with slim, you can refer to this official example.
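For reference, a rough sketch of what tower-style data parallelism looks like in plain TF 1.x, in the spirit of that official example. The model_fn, shapes, class count, and loss below are placeholders for illustration, not this repo's code.

```python
# Minimal tower-style data-parallelism sketch (TF 1.x), assuming 2 GPUs.
import tensorflow as tf

NUM_GPUS = 2
NUM_CLASSES = 32  # placeholder class count

def model_fn(images):
    # Placeholder network: one conv layer standing in for the real model.
    return tf.layers.conv2d(images, NUM_CLASSES, 3, padding='same', name='logits')

images = tf.placeholder(tf.float32, [None, 224, 224, 3])
labels = tf.placeholder(tf.float32, [None, 224, 224, NUM_CLASSES])
optimizer = tf.train.AdamOptimizer(1e-4)

# Split the batch across GPUs; each tower shares the same variables.
image_splits = tf.split(images, NUM_GPUS)
label_splits = tf.split(labels, NUM_GPUS)

tower_grads = []
for i in range(NUM_GPUS):
    with tf.device('/gpu:%d' % i), tf.variable_scope('model', reuse=(i > 0)):
        logits = model_fn(image_splits[i])
        loss = tf.reduce_mean(
            tf.nn.softmax_cross_entropy_with_logits_v2(
                labels=label_splits[i], logits=logits))
        tower_grads.append(optimizer.compute_gradients(loss))

# Average the gradients across towers and apply them once.
avg_grads = []
for grads_and_vars in zip(*tower_grads):
    grads = [g for g, _ in grads_and_vars]
    avg_grads.append((tf.reduce_mean(tf.stack(grads), axis=0),
                      grads_and_vars[0][1]))
train_op = optimizer.apply_gradients(avg_grads)
```

With this layout, a batch of 64 is split into two halves of 32, so each card only has to hold the activations for half the batch.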

Spritea avatar Dec 26 '18 02:12 Spritea

Hi Spritea,

Thank you for your comment. I will review the source code carefully and see what I can do.

vanqm avatar Jan 03 '19 01:01 vanqm

Hi, I am a bit stuck. I get this error with the default settings, and even when I set batch_size = 1. TensorFlow 1.10.0, Python 3.5.6 (Anaconda).
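In case it helps, one quick check (standard TF 1.x API, unrelated to this repo's scripts) is to print how much memory TensorFlow can actually claim on each GPU; another process already holding the card can trigger ResourceExhaustedError even with batch_size = 1.

```python
# Diagnostic sketch: list GPUs and the memory TensorFlow can claim on each.
from tensorflow.python.client import device_lib

for d in device_lib.list_local_devices():
    if d.device_type == 'GPU':
        print(d.name, 'memory_limit: %.0f MiB' % (d.memory_limit / 1024 ** 2))
```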

burhr2 avatar Feb 18 '19 16:02 burhr2