tf-faster-rcnn
                        ResourceExhaustedError
My GPU is a GTX 1060, and the demo ran successfully, but something goes wrong when I try to train. I used this command: ./experiments/scripts/train_faster_rcnn.sh 0 pascal_voc vgg16. Can anyone help? Thanks!
ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[25088,4096]
[[Node: gradients/vgg_16_2/fc6/kernel/Regularizer/l2_regularizer/L2Loss_grad/mul = Mul[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"](vgg_16/fc6/weights/read, gradients/vgg_16_2/fc6/kernel/Regularizer/l2_regularizer_grad/tuple/control_dependency_1)]]
[[Node: LOSS_default/add_5/_253 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_1113_LOSS_default/add_5", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]]
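For context, a rough back-of-the-envelope check on why this particular tensor is so expensive (plain arithmetic, not anything taken from the repo):

```python
# The tensor the allocator fails on is the VGG16 fc6 weight matrix.
rows, cols = 25088, 4096          # 7*7*512 flattened conv features -> 4096 units
bytes_per_float = 4               # DT_FLOAT is float32
size_mb = rows * cols * bytes_per_float / 1024 ** 2
print(f"fc6 kernel alone: {size_mb:.0f} MB")   # ~392 MB
# Training also keeps a same-shaped gradient (and usually an optimizer slot),
# so several such copies plus the activations can exhaust a GTX 1060.
```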
Who can help me?
same problem here
You can reduce the model's memory usage by reducing the size of a few parameters.
Here are the ones to look at; try changing them one at a time or in combination, and keep the reductions as small as you can get away with.
In lib/model/config.py: __C.TRAIN.SCALES = (600,) and __C.TRAIN.MAX_SIZE = 1000
In experiments/cfgs/vgg16.yml: RPN_BATCHSIZE: 256 and BATCH_SIZE: 256
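For example, lowering the image-size limits could look roughly like this (the specific numbers below are only illustrative guesses, not tested recommendations):

```python
# lib/model/config.py -- illustrative reductions; pick values that fit your GPU
__C.TRAIN.SCALES = (400,)    # target size of the shorter image side, down from (600,)
__C.TRAIN.MAX_SIZE = 800     # cap on the longer image side, down from 1000
```

The same idea applies to RPN_BATCHSIZE and BATCH_SIZE in experiments/cfgs/vgg16.yml, e.g. halving them from 256 to 128.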
Thank you, but the problem remains. It happens during the loading of the weights from vgg16.ckpt.
Obviously it will happen when loading the weights. Try res50 instead; the GPU you are using doesn't have enough memory to train these networks. How much memory does your GPU have?
I solved this problem by using the res101 pre-trained weights in place of vgg16. Good luck!
I solved this problem by restarting the terminal.