Worse Chainer convnet-benchmarks performance on cupy-2.0.0 compared to cupy-1.0.0.1
Hello, could you please help explain this issue? Thanks in advance. We found that convnet-benchmarks performance on cupy-2.0.0 is worse than on cupy-1.0.0.1. We don't know whether this is a problem in cupy or in the convnet-benchmarks scripts. We reported this issue at https://github.com/cupy/cupy/issues/753 but have not received a response yet.
---------------------details--------------------------
Test environment: P100

Test steps:
1. Install chainer.
2. Get the convnet-benchmarks code: git clone https://github.com/mitmul/convnet-benchmarks
3. Run the test cases below.

3.1: case "pip install cupy==1.0.0.1"
(py2-chainer-gpu) [sys_dltest@mlt-gpu200 chainer]$ python train_imagenet.py alexnet
('Chainer version:', '2.0.0b1')
('CuPy version:', '1.0.0.1')
('CUDA:', True)
('CUDA Version:', u'V8.0.61')
('cuDNN:', True)
('cuDNN Version:', 5110)
('Input data shape:', (128, 3, 224, 224))
('Average Forward: ', 16.15312328338623, ' ms')
('Average Backward: ', 35.27830085754395, ' ms')
('Average Total: ', 51.431424140930176, ' ms')

3.2: case "pip install cupy==2.0.0"
(py2-chainer-gpu) [sys_dltest@mlt-gpu200 chainer]$ python train_imagenet.py alexnet
('Chainer version:', '2.0.0b1')
('CuPy version:', '2.0.0')
('CUDA:', True)
('cuDNN:', True)
('cuDNN Version:', 5110)
('Input data shape:', (128, 3, 224, 224))
('Average Forward: ', 35.381299591064455, ' ms')
('Average Backward: ', 63.26389694213867, ' ms')
('Average Total: ', 98.64519653320312, ' ms')

3.3: case "pip install cupy==2.0.0rc1"
(py2-chainer-gpu) [sys_dltest@mlt-gpu200 chainer]$ python train_imagenet.py alexnet
('Chainer version:', '2.0.0b1')
('CuPy version:', '2.0.0rc1')
('CUDA:', True)
('cuDNN:', True)
('cuDNN Version:', 5110)
('Input data shape:', (128, 3, 224, 224))
('Average Forward: ', 35.5438117980957, ' ms')
('Average Backward: ', 63.336796569824216, ' ms')
('Average Total: ', 98.88060836791992, ' ms')
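
To put the three runs above side by side, here is a small sketch that computes the slowdown of each cupy 2.0.0* run relative to the cupy 1.0.0.1 run. The timings are copied from the logs above and rounded to three decimals; the names `timings_ms` and `slowdown` are only illustrative and are not part of convnet-benchmarks.

```python
# Average timings in ms, copied (rounded) from the three benchmark runs above.
timings_ms = {
    '1.0.0.1': {'forward': 16.153, 'backward': 35.278, 'total': 51.431},
    '2.0.0': {'forward': 35.381, 'backward': 63.264, 'total': 98.645},
    '2.0.0rc1': {'forward': 35.544, 'backward': 63.337, 'total': 98.881},
}


def slowdown(baseline, candidate):
    """Return candidate/baseline time ratio per phase (values > 1 mean slower)."""
    return {phase: candidate[phase] / baseline[phase] for phase in baseline}


baseline = timings_ms['1.0.0.1']
for version in ('2.0.0rc1', '2.0.0'):
    ratios = slowdown(baseline, timings_ms[version])
    summary = ', '.join(
        '%s x%.2f' % (phase, ratios[phase])
        for phase in ('forward', 'backward', 'total'))
    print('cupy %s vs 1.0.0.1: %s' % (version, summary))
```

With these numbers, both 2.0.0rc1 and 2.0.0 come out at roughly 2.2x slower on the forward pass, 1.8x on the backward pass, and 1.9x in total.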

Note: to run the "cupy==2.0.0*" cases, you need to comment out the following lines in train_imagenet.py:

    #if chainer.cuda.available:
    #    cuda_v = cupy.cuda.compiler._get_nvcc_version().split()[-1].decode('utf-8')
    #    print('CUDA Version:', cuda_v)
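
As an alternative to commenting those lines out by hand, the version lookup could be guarded so the same script runs on both cupy versions. This is only a sketch under an assumption: that the lines have to be disabled because the private helper `_get_nvcc_version` is missing from `cupy.cuda.compiler` in cupy 2.0.0*, which this report does not confirm. The function `nvcc_version_or_none` is a hypothetical helper, and the stand-in objects below replace the real `cupy.cuda.compiler` module so the example runs without a GPU.

```python
import types


def nvcc_version_or_none(compiler_module):
    """Return the nvcc version string, or None if the private helper is absent.

    Assumption: `compiler_module` plays the role of cupy.cuda.compiler, and
    `_get_nvcc_version` (a private helper) may not exist in every cupy version.
    """
    getter = getattr(compiler_module, '_get_nvcc_version', None)
    if getter is None:
        return None
    # Same parsing as the original lines: take the last token and decode it.
    return getter().split()[-1].decode('utf-8')


# Stand-ins for cupy.cuda.compiler, so the sketch runs without cupy installed.
with_helper = types.SimpleNamespace(
    _get_nvcc_version=lambda: b'Cuda compilation tools, release 8.0, V8.0.61')
without_helper = types.SimpleNamespace()

for module in (with_helper, without_helper):
    cuda_v = nvcc_version_or_none(module)
    if cuda_v is not None:
        print('CUDA Version:', cuda_v)
```

In train_imagenet.py the call would presumably be `nvcc_version_or_none(cupy.cuda.compiler)` inside the existing `if chainer.cuda.available:` block; if the helper exists but fails for a different reason, this guard would not help.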
It seems I posted this in the wrong place; I meant to go to https://github.com/mitmul/convnet-benchmarks. Sorry.
It seems that the convnet-benchmarks performance returns to normal after we upgrade cupy to '3.0.0a1'.