benchmarks

VGG-11, VGG-16, and VGG-19 accuracy does not increase

Open wx1111 opened this issue 6 years ago • 1 comments

Hi, I ran the benchmark on my cluster with the VGG-11, VGG-16, and VGG-19 models in distributed mode (1 ps and 4 workers). The accuracy does not increase.

The optimizer settings are:

optimizer: rmsprop
init_learning_rate: 0.01
num_epochs_per_decay: 5
learning_rate_decay_factor: 0.95
momentum: 0.9

The batch_size is the default value. The accuracy is as follows.

vgg11:

Step   Img/sec                                     total_loss  top_1_accuracy  top_5_accuracy
9500   images/sec: 31.4 +/- 0.0 (jitter = 0.7)     6.897       0.000           0.008
9600   images/sec: 31.4 +/- 0.0 (jitter = 0.7)     6.909       0.000           0.004
9700   images/sec: 31.4 +/- 0.0 (jitter = 0.7)     6.909       0.000           0.004
9800   images/sec: 31.4 +/- 0.0 (jitter = 0.7)     6.910       0.000           0.004
9900   images/sec: 31.4 +/- 0.0 (jitter = 0.7)     6.917       0.004           0.004
10000  images/sec: 31.4 +/- 0.0 (jitter = 0.7)     6.903       0.000           0.000
10100  images/sec: 31.4 +/- 0.0 (jitter = 0.7)     6.910       0.000           0.008

vgg16 and vgg19 show almost the same accuracy.

Any ideas? Thanks a lot!

wx1111 • Aug 13 '18 09:08

Unfortunately, no one is working on either distributed convergence/performance or the VGG models, so we currently have no one to look into this. At the moment, our focus is on single-machine resnet50.

reedwm • Aug 27 '18 23:08
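
One observation on the reported vgg11 numbers, offered as a rough sanity check rather than a diagnosis. Assuming the run uses 1000-class labels (an assumption; the post does not name the dataset), a classifier that predicts a uniform distribution has a cross-entropy of ln(1000) ≈ 6.908, a top-1 accuracy of 0.001, and a top-5 accuracy of 0.005. The sketch below compares the logged total_loss values against that chance level. `NUM_CLASSES` and the variable names are illustrative, and total_loss may also include regularization terms, so the comparison is only approximate.

```python
import math

# Assumption: 1000 output classes (not stated in the original post).
NUM_CLASSES = 1000

# Expected metrics for a model that predicts every class with equal probability.
chance_loss = math.log(NUM_CLASSES)   # cross-entropy of a uniform prediction
chance_top1 = 1 / NUM_CLASSES
chance_top5 = 5 / NUM_CLASSES

# (step, total_loss) pairs copied from the vgg11 log in the post.
reported = [
    (9500, 6.897),
    (9600, 6.909),
    (9700, 6.909),
    (9800, 6.910),
    (9900, 6.917),
    (10000, 6.903),
    (10100, 6.910),
]

print(f"chance-level loss: {chance_loss:.3f}")
for step, loss in reported:
    # A small gap means the loss is indistinguishable from random guessing.
    print(f"step {step}: total_loss={loss:.3f}, gap={loss - chance_loss:+.3f}")
```

Under that assumption, every logged loss sits within about 0.01 of the chance level, which is what a model that has not started learning would produce. A lower initial learning rate is a common first thing to try in this situation, though nothing in the thread confirms that it is the cause here.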