Fabian Tschopp

119 comments by Fabian Tschopp

Can I see the implementation? Can you give me the convolution parameters and how long it takes to compute it? Make sure when taking the time of just one OpenCL...

> Not yet, but rather soon ;-) Most likely in 2 weeks or a month.

Ok, the more I know the better I can help with performance related questions. Alternatively...

ResNet-50 can be obtained here, among other places: https://github.com/cvjena/cnn-models/tree/master/ResNet_preact/ResNet50_cvgj
You can do a precise layer-wise benchmark like this: `./build/tools/caffe time -lt -gpu=0 -iterations=5 -model deploy.prototxt`
You should also edit the...

I've got the following performance numbers (batch size 16, average forward pass):
- GTX 1080 @ CUDA+cuBLAS+cuDNN: 87.29 ms (5.45 ms per image)
- GTX 1080 @ OpenCL+CLBlast+LibDNN: 132.053 ms...
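To make the comparison concrete, here is a minimal sketch of the arithmetic behind the per-image figures, using only the two batch timings quoted above. The function name `per_image_ms` and the dictionary layout are illustrative assumptions, not part of Caffe or LibDNN.

```python
# Sketch only: derives per-image latency and relative slowdown from the
# batch timings quoted in the comment. Names are hypothetical helpers.

BATCH_SIZE = 16  # batch size stated in the comment

# Average forward-pass time per batch, in milliseconds (from the comment).
timings_ms = {
    "CUDA+cuBLAS+cuDNN": 87.29,
    "OpenCL+CLBlast+LibDNN": 132.053,
}


def per_image_ms(batch_ms: float, batch_size: int) -> float:
    """Average forward time per image for one batch timing."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return batch_ms / batch_size


baseline_ms = timings_ms["CUDA+cuBLAS+cuDNN"]
for backend, batch_ms in timings_ms.items():
    image_ms = per_image_ms(batch_ms, BATCH_SIZE)
    ratio = batch_ms / baseline_ms  # 1.0 for the CUDA baseline itself
    print(f"{backend}: {image_ms:.3f} ms per image, {ratio:.2f}x the CUDA time")
```

Dividing by the batch size only gives an average; it says nothing about how latency scales at other batch sizes.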

One factor could be memory allocation and total memory use. Batch size 32 certainly would not fit into the on-GPU memory (2 GB) of your Radeon RX 555 in the case of...

@romix That's very little for ResNet-50. Do you re-use the buffers during inference-only forward passes? This: https://github.com/beniz/deepdetect/issues/84 And this: 7341467648 B (approx. 6.8 GB) is required to train ResNet-50 on...
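As a quick check of the byte figure quoted above, the following sketch converts it to binary gigabytes and compares it against a device capacity. The helper names are hypothetical, and the 2 GiB capacity is only an illustrative value taken from the Radeon RX 555 mentioned in an earlier comment; it is not a statement about the hardware the 6.8 GB figure was measured on.

```python
# Sketch only: unit conversion for the memory requirement quoted in the
# comment. Helper names and the example capacity are assumptions.

GIB = 2**30  # bytes per binary gigabyte


def bytes_to_gib(num_bytes: int) -> float:
    """Convert a byte count to GiB (2**30 bytes)."""
    return num_bytes / GIB


def fits_on_device(required_bytes: int, capacity_bytes: int) -> bool:
    """True if the requirement fits within the given device memory."""
    return required_bytes <= capacity_bytes


required = 7_341_467_648  # bytes, as quoted in the comment
example_capacity = 2 * GIB  # illustrative 2 GiB card

print(f"required: {bytes_to_gib(required):.2f} GiB")
print(f"fits in 2 GiB: {fits_on_device(required, example_capacity)}")
```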

Thanks, I'll check it out :) Btw. there's now also aggressive buffer reuse as an optional flag for networks in Caffe.

https://github.com/naibaf7/caffe/commit/411defea28780d437360dd9ed38ff6dd51903d62 Not in mainline Caffe yet, sorry. But hopefully soon.

@edgarriba Yes, it is possible to add OpenCL and CUDA tests with Travis CI. We'll have to look into this, but I'd need to add reference CPU kernels for testing first.

@edgarriba These tests won't help that much I think, as they are not representative of real world issues (such as linking an actual framework to libdnn, or getting it to...