mtcnn
Training PNet is very slow
When I run python example/train_P_net.py --gpus 0 (my GPU is a 1070), I get:
INFO:root:Epoch[0] Batch [200] Speed: 123.25 samples/sec Train-Accuracy=0.697969
INFO:root:Epoch[0] Batch [200] Speed: 123.25 samples/sec Train-LogLoss=0.617246
INFO:root:Epoch[0] Batch [200] Speed: 123.25 samples/sec Train-BBOX_MSE=0.103584
Can you help me? Is something wrong? Where is the mistake? Thanks.
You need to put your data on an SSD.
@xiaoxiongli Thank you. How long does it take on your PC, and what is your PC's configuration? Thanks.
@tzhang2014 I have the same problem. How did you improve it?
INFO:root:Epoch[0] Batch [200] Speed: 126.56 samples/sec Train-Accuracy=0.697195
INFO:root:Epoch[0] Batch [200] Speed: 126.56 samples/sec Train-LogLoss=0.614800
INFO:root:Epoch[0] Batch [200] Speed: 126.56 samples/sec Train-BBOX_MSE=0.106309
Only the first epoch is slow; the later ones are very fast.
You can change MXNet's environment variables to speed up training, for example with the commands export MXNET_GPU_WORKER_NTHREADS=4 (default = 2) and export MXNET_GPU_COPY_NTHREADS=4 (default = 1). After I did that, everything got better.
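If you would rather not export the variables in every shell, the same two settings can be made from Python at the top of the training script. This is only a sketch: the helper name configure_mxnet_threads is made up for illustration, and I am assuming MXNet reads these variables when its engine starts, so they should be set before mxnet is imported.

```python
import os


def configure_mxnet_threads(worker_threads=4, copy_threads=4):
    """Hypothetical helper: set the two MXNet thread-count variables.

    Assumption: MXNet picks these up when its engine starts, so call
    this before the first `import mxnet`.
    """
    os.environ["MXNET_GPU_WORKER_NTHREADS"] = str(worker_threads)
    os.environ["MXNET_GPU_COPY_NTHREADS"] = str(copy_threads)
    # Return what was set so the caller can log it.
    return {
        "MXNET_GPU_WORKER_NTHREADS": os.environ["MXNET_GPU_WORKER_NTHREADS"],
        "MXNET_GPU_COPY_NTHREADS": os.environ["MXNET_GPU_COPY_NTHREADS"],
    }


settings = configure_mxnet_threads()
print(settings)
# import mxnet as mx  # import only after the variables are set
```

The value 4 is just the one suggested above; the best setting probably depends on your CPU and GPU, so it is worth comparing the samples/sec figure in the log before and after.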
e.g., on an i7-7700 with a GTX 1060:
INFO:root:Epoch[0] Batch [3780] Speed: 8343.78 samples/sec Accuracy=0.898810 LogLoss=0.270442 BBOX_MSE=0.015827
INFO:root:Epoch[0] Batch [3800] Speed: 9112.26 samples/sec Accuracy=0.891901 LogLoss=0.282063 BBOX_MSE=0.015802
INFO:root:Epoch[0] Batch [3820] Speed: 10172.07 samples/sec Accuracy=0.883745 LogLoss=0.303172 BBOX_MSE=0.015691
INFO:root:Epoch[0] Batch [3840] Speed: 10388.03 samples/sec Accuracy=0.878459 LogLoss=0.288958 BBOX_MSE=0.015310
INFO:root:Epoch[0] Batch [3860] Speed: 9720.13 samples/sec Accuracy=0.885983 LogLoss=0.310603 BBOX_MSE=0.015680
INFO:root:Epoch[0] Batch [3880] Speed: 9980.33 samples/sec Accuracy=0.879565 LogLoss=0.300225 BBOX_MSE=0.016198
@linsoncvw After the first epoch, the speed is very fast. I don't understand why.
Did you run into "Cannot find argument 'out_grad'" when using train_P_net.py?
@geoffzhang I ran into the same problem. Did you fix it?
@geoffzhang @EmiPark Delete every 'out_grad=True' in core\symbol.py.
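If there are many occurrences, the deletion can be scripted instead of done by hand. Below is a rough sketch, not something from the repo: strip_out_grad is a hypothetical helper, and the sample line is invented for illustration rather than copied from core\symbol.py. It removes the out_grad=True keyword argument together with its comma and leaves everything else untouched.

```python
import re

# The argument can appear after another argument (", out_grad=True")
# or as the first argument ("out_grad=True, ...").
_TRAILING = re.compile(r",\s*out_grad\s*=\s*True")
_LEADING = re.compile(r"out_grad\s*=\s*True\s*,\s*")


def strip_out_grad(source):
    """Return `source` with every `out_grad=True` argument removed."""
    source = _TRAILING.sub("", source)
    source = _LEADING.sub("", source)
    return source


# Invented sample line, for illustration only.
sample = "out = some_op(data=x, label=y, out_grad = True, name='out')"
cleaned = strip_out_grad(sample)
print(cleaned)
```

To apply it to the real file, read the file's text, pass it through strip_out_grad, and write the result back. Keep a backup copy first, since this is a plain text substitution and does not parse the Python code.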
If I delete "out_grad = True", does that have any impact on training?